Author's Preface


In this book are presented the fundamental laws and principal 
methods of quantum mechanics, together with some applications. 
For this reason only a few indicative notes will be found on the 
vast field of spectroscopy (with the exception of the theory of the 
hydrogen spectrum, which has a fundamental character). Furthermore,
the theoretical interpretation of the periodic system of the
elements, the statistical theories, the theory of the chemical bond, 
and problems of atomic collisions are not included. 

Instead we have dwelled somewhat at length on the epistemological
foundations of the new atomic mechanics. For anyone
starting a study of this field, a major difficulty is the comprehension 
of its epistemological position, which is both unusual and profound. 
To help overcome this difficulty an exponent of the theory may 
choose various approaches. These range from the intuitive treatment
(of necessity less rigorous, being based on analogies and qualitative
discussions) to the strictly logical approach, which is bound
to be abstract and formal. The former will be especially agreeable
to mentalities of the “visual” type but will leave dissatisfied the
critical mind with a logical inclination. The situation is reversed 
for the latter approach. In practice, it is convenient to strike a 
compromise in such a manner as to satisfy all types of readers. In 
this book, the author has attempted to resolve various exigencies 
by dividing the work into three parts of increasing orders of 
abstraction. 

The first part serves to give a historical, overall view of the 
evolution of quantum mechanics. The approach here, as intuitive 
and elementary as is possible, describes some of the more important 
experiments upon which the theory is based, and gives a first glimpse 
of that theory. 

The second part, of a more advanced nature, is preceded by a 
mathematical introduction. Here the principles of quantum 
mechanics in the particular form due to Schrödinger (wave mechanics)
are established on the basis of the uncertainty principle. The
most important problems involving a single particle are treated by
this method. The old quantum theory of Bohr and Sommerfeld,
which is still very useful from the didactic standpoint, is developed
in the last chapter of the second part, rather than before wave
mechanics as is customary. In fact, it appears that this arrangement,
though not in historical order, is most logical, because the
postulates of the Bohr-Sommerfeld theory do not seem to be a
priori logically justifiable, though they may be obtained as a first
approximation from the laws of wave mechanics.

Finally, in the third part, the principles of quantum mechanics 
in their most general form are presented (transformation theory).
More advanced mathematical methods are used here (Hilbert space
and matrices), which are briefly explained in the introductory
chapter. However, in this last part also, recourse to the more
abstract mathematical methods was avoided, but at the expense of
maximum generality and rigor. Anyone desirous of a more profound
exposition of quantum mechanics, from the point of view of
its logical structure, is referred to the works of Neumann and Weyl 
cited in the Bibliography. 


Enrico Persico



Translator’s Preface 


The physical constants occurring in this book have been replaced 
by their most recent values, obtained mainly from J. W. M. DuMond
and E. R. Cohen, Rev. Mod. Phys. 20, 82 (1948) and ibid. 21, 651
(1949), and R. T. Birge, Rev. Mod. Phys. 13, 233 (1941); the references
pertaining to these constants have been brought up to date.

A number of supplementary notes and corrections supplied by 
Professor Persico were incorporated into the book. I have taken the 
liberty to insert additional footnotes and occasional sentences 
throughout the text whenever I thought it advisable in the interest 
of clarity; some changes were warranted by recent developments 
in the field of atomic physics. The list of books in the Bibliography 
has been enlarged by including a number of new references to books 
in the English language, and titles of books already listed have been 
supplemented by those of their English translations in cases where 
such translations exist. Otherwise, the original text has been left 
unaltered with regard to both content and style insofar as the difference
in idiom would permit.

I am grateful to Professor Birge for help concerning the general 
constants, and to Odette Temmer for her aid in the preparation of 
the manuscript. 

Georges M. Temmer


CHAPTER 1
The Atomic Model 

1. The concept of the atom in chemistry and physics. Although 
the hypothesis of a corpuscular structure of matter had been enunciated
since earliest times, it became necessary for the interpretation
of experimental facts only when the fundamental chemical laws of 
constant proportions and of multiple proportions were discovered. 
Indeed, these laws naturally lead one to consider every simple body 
to be made up of identical particles. These particles can combine 
in various ways, forming innumerable compounds, but they always 
occur as indivisible entities in all chemical phenomena: as atoms, in
the etymological sense of the term. Chemistry permits us to determine
the ratios of the weights of the various atoms with great accuracy,
or to express all atomic weights in definite units in terms of one
of them (notably of the oxygen atom). It does not, however, enable 
us to compare the weight of an atom with the weights of ordinary 
bodies; hence we cannot obtain atomic weights in fractions of a 
gram. Chemistry gives even less indication of the size and shape of 
atoms; it is content with the realization that atoms are so small as 
to escape direct observation. Therefore in order to understand the 
major part of chemical phenomena, it is sufficient to represent the 
atoms, for instance, as minute, hard, indivisible spheres (whose 
diameter is not of interest) possessing a particular kind of attractive 
force by means of which they can group themselves into molecules. 
The latter, when in turn assembled into large aggregates, constitute
the bodies we see. 

This oversimplified conception of the structure of matter is not 
only sufficient to account for chemical phenomena but also permits 
us to develop the theory of many physical phenomena. An example 
is the kinetic theory of gases, which obtains a large part of its results 
by considering the molecules as simple points or spheres, or aggregates
of spheres. Furthermore, it is found that the mass m of the
individual molecules enters into many formulas of the kinetic theory
of gases, but that in the final expressions destined to be compared
with experience, m appears almost invariably multiplied by the total
number of molecules. Actually, then, only the total mass of the gas 
occurs; consequently, the major part of the theory is independent of 
the size attributed to the molecules, and similarly of their number. 
It is therefore not possible to draw any conclusions concerning these 
quantities. For this reason, up until fairly recently, the absolute 
size of molecules and atoms was ignored in the development of 
atomic theory. 

Later on, however, phenomena were discovered in which the 
mass of a single molecule occurs directly, or at least Avogadro's
number N, the number of molecules contained in one gram-molecular
weight.¹ The Brownian movement is such a phenomenon. These
phenomena have made possible the experimental determination of
the number that, so to speak, represents the key for the passage
from the world of ordinary bodies to the world of the atoms. It
has even been possible to find Avogadro’s number by about 15 independent
methods. The fact that under the atomic hypothesis
results have been obtained that agree within the limits of experimental
error is the best proof of its validity.²

The dimensions of molecules or atoms enter into another group
of phenomena (for example the viscosity of gases), and hence the
latter have made it possible to determine these dimensions, at least
as far as their order of magnitude is concerned. It was found that
atoms have diameters of the order of $10^{-8}$ cm and that the distances
at which atoms arrange themselves in forming molecules are of the
same order, so that for molecules dimensions of the order of $10^{-8}$ cm
likewise result. Later on it will be seen how the modern theories
furnish values for them with remarkable precision in the various
cases.

¹ As is known, one gram-molecular weight is the number of grams equal to
the molecular weight. Therefore, if Avogadro's number N is known, the
weight in grams of a single molecule is obtained by dividing the molecular
weight by N. Similarly, the weight in grams of one atom is obtained by
dividing the atomic weight by N.

² The most reliable value (1947) of Avogadro's number is $N = 6.0235 \times 10^{23}$,
with an uncertainty of several units in the last significant figure [J. W. M.
DuMond and E. R. Cohen, Rev. Mod. Phys. 21, 651 (1949)]. From this value
may be derived the unit adopted by the chemists, namely, 1/16 of the atomic
weight of oxygen (a mixture of various isotopes, cf. §4), equal to
$1/N = 1.6602 \times 10^{-24}$ g; the weight of one hydrogen atom differs very
slightly from this value ($1.6732 \times 10^{-24}$ g). In physics a slightly smaller
unit is often used to express atomic weights, namely, 1/16 of the weight of
the most abundant isotope of oxygen. Multiplication by 1.00027 of weights
expressed in chemical units converts them to the corresponding weights on
the physical scale.
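The passage from relative atomic weights to weights in grams, as described in the footnotes on Avogadro's number, can be checked numerically. The following is only a minimal sketch using the 1947 figures quoted in the text; the function and constant names are illustrative and are not taken from the book.

```python
# Values quoted in the text (1947 figures); nothing here is newer data.
AVOGADRO_N = 6.0235e23   # molecules per gram-molecular weight
CHEM_TO_PHYS = 1.00027   # factor from the chemical to the physical scale


def weight_in_grams(relative_weight: float, n: float = AVOGADRO_N) -> float:
    """Weight in grams of one atom or molecule of the given atomic or molecular weight."""
    return relative_weight / n


def to_physical_scale(chemical_weight: float) -> float:
    """Convert an atomic weight on the chemical scale to the physical scale."""
    return chemical_weight * CHEM_TO_PHYS


unit_g = weight_in_grams(1.0)           # the chemists' unit, 1/N grams
oxygen_atom_g = weight_in_grams(16.0)   # oxygen is 16 on the chemical scale

print(f"chemical unit = {unit_g:.4e} g")
print(f"oxygen atom   = {oxygen_atom_g:.4e} g")
print(f"oxygen on the physical scale = {to_physical_scale(16.0):.5f}")
```

The first printed value should reproduce the unit of about $1.6602 \times 10^{-24}$ g stated in the footnote.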

However, the concept of atoms as simple and inert fragments of 
matter is incapable of explaining a large number of physical phenomena.
Indeed, the emission and absorption of light, magnetic
phenomena, X rays, radioactivity, phenomena that occur upon
the passage of electricity through gases, electrolysis, the photoelectric
effect, the thermionic effect, and others, all lead us to regard
atoms as more or less complex mechanisms, composed of still smaller
parts and containing positive and negative electric charges, normally
in equal quantities. These electric charges always occur as multiples
of an elementary charge e. Thus we are led to attribute a
corpuscular structure to electricity as well as to matter, and to consider
the electrical corpuscles as constituents of atoms. Atoms
have thus lost their original significance of “indivisible” particles 
(indeed, diverse phenomena are known today, such as the rather 
common one of ionization, in which the atom is “divided”), and 
are instead considered to be mechanical systems, formed from several 
particles of different kinds (positive and negative electrons, protons, 
neutrons), with which we shall deal later on. Some of these particles 
are charged positively, others negatively, and still others are neutral; 
all are in motion with respect to each other under the action of 
mutual forces. 

This complex concept of the atom as a structure containing 
electricity in motion, together with the electromagnetic nature of 
light, opens the possibility of understanding the interaction of matter
and radiation as represented by such phenomena as emission and
absorption, scattering, and resonance. Indeed, it is known from
electromagnetic theory that the motion of electric charges can produce
electromagnetic waves which, in turn, may set in motion electric
charges upon which they are falling. Furthermore, since
electric charges in motion generate a magnetic field, we may foresee 
the possibility of explaining the magnetic properties of matter in 
this way. It may even be observed that one of the first not entirely 
static conceptions of the atom (or molecule) was formulated by 
Ampère for the express purpose of reducing magnetic phenomena
to electric currents circulating among the smallest constituents of
matter.

Nevertheless, as we shall see shortly, these ample possibilities,
which we can now perceive upon a superficial examination of the 
question, were realized only with considerable difficulty and led 
further than could have been suspected at first sight. 

2. The electron. The negative electric particles, or electrons,
have long been known to be the constituents of cathode rays as well
as of beta rays from radioactive substances. Furthermore, they
are emitted from incandescent metals (thermionic effect) and from
many other substances when they are struck by light of short wavelength
(photoelectric effect) or by X rays. The electrons obtained in
these various ways are always identical in all their properties. By
means of the deflection suffered in an electric and magnetic field,
it has been possible to measure the ratio e/m of charge to mass, so
that the value of m may be obtained from an independent measurement
of e. The electronic charge e has been measured with great
precision by C. T. R. Wilson and R. A. Millikan, by the “cloud-chamber
method” and the “oil-drop method,” respectively. The
latter method permits the determination of the charge of the electrons
taken one at a time, and from its results we can assert that the
charges of all electrons are actually equal to each other, and that the
value found is not just an average value.³ The most reliable values
of the charge (in absolute value) and mass of the electron are (1947)⁴

Charge: $e = 4.8029 \times 10^{-10}$ e.s.u. $= 1.602 \times 10^{-20}$ e.m.u.

Mass (for low velocities): $m_0 = 9.1055 \times 10^{-28}$ g.

The mass turns out to be about 1837 times smaller than the mass
of the hydrogen atom. For high velocities, it is found that the
ratio e/m suffers a decrease proportional to $\sqrt{1 - v^2/c^2}$, which is in
agreement with the theory of relativity (if it is assumed, as seems
reasonable, that the charge e remains unchanged), according to
which the mass m should vary as

$$m = \frac{m_0}{\sqrt{1 - (v^2/c^2)}}$$

where $m_0$ is the value of the mass for low velocities.
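The velocity dependence of the mass, and hence of the ratio e/m, can be illustrated with a short numerical sketch. It assumes, as the text does, that the charge e is independent of velocity. The value used for the speed of light is an assumption of this sketch and is not quoted in the text; the function names are likewise illustrative.

```python
import math

C_CM_PER_SEC = 2.998e10   # speed of light in cm/sec (assumed, not from the text)
M0_GRAMS = 9.1055e-28     # low-velocity electron mass quoted in the text
E_ESU = 4.8029e-10        # electronic charge quoted in the text


def relativistic_mass(v: float, m0: float = M0_GRAMS, c: float = C_CM_PER_SEC) -> float:
    """Mass at speed v according to m = m0 / sqrt(1 - v^2/c^2); requires |v| < c."""
    beta_sq = (v / c) ** 2
    if beta_sq >= 1.0:
        raise ValueError("speed must be below c")
    return m0 / math.sqrt(1.0 - beta_sq)


def charge_to_mass(v: float) -> float:
    """Ratio e/m at speed v, assuming the charge e does not depend on v."""
    return E_ESU / relativistic_mass(v)


for fraction in (0.0, 0.1, 0.5, 0.9):
    v = fraction * C_CM_PER_SEC
    ratio = charge_to_mass(v) / charge_to_mass(0.0)
    print(f"v = {fraction:.1f} c   (e/m) relative to low velocity = {ratio:.4f}")
```

The printed ratio is simply $\sqrt{1 - v^2/c^2}$, which is the decrease in e/m referred to above.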

³ For a description of these experiments, see Nos. 27, 28, and 31 of the
Bibliography.

⁴ For a critical discussion of these and other physical constants, cf. R. T. Birge,
Rev. Mod. Phys. 13, 233 (1941) and Reports on Progress in Physics, Vol. VIII, p. 90,
1942; J. W. M. DuMond and E. R. Cohen, Rev. Mod. Phys. 20, 82 (1948) and
21, 651 (1949). We adopt DuMond and Cohen’s values, unless otherwise stated.

It has been determined that in electrolysis a gram-molecular
weight of monovalent ions transports a charge of 96,490
coulombs, equaling $2.8927 \times 10^{14}$ e.s.u. Dividing this quantity by
Avogadro’s number ($6.023 \times 10^{23}$), we obtain the charge of each
monovalent ion, which coincides with the value recently reported
for the electronic charge.⁵ This agreement leads to the conclusion
that the electric charge of monovalent ions is due to an electron
being added to or removed from a neutral atom, and similarly that
ions of higher valences are atoms having a certain number of electrons
more or less than neutral atoms. In short, the electrons
observed as free particles in cathode rays, in beta rays, and so on,
are identified with those manifesting themselves in electrolytic
phenomena.
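The division described above is easily carried out. The sketch below uses only the figures quoted in the text and compares the result with the electronic charge given in section 2; the names are illustrative.

```python
FARADAY_COULOMBS = 96_490.0   # charge per gram-molecular weight of monovalent ions
FARADAY_ESU = 2.8927e14       # the same charge in e.s.u. (from the text)
AVOGADRO_N = 6.023e23         # Avogadro's number as rounded in the text
E_MEASURED_ESU = 4.8029e-10   # electronic charge quoted in section 2


def charge_per_ion(faraday_charge: float, n: float = AVOGADRO_N, valence: int = 1) -> float:
    """Charge of a single ion: valence times the charge per monovalent ion."""
    return valence * faraday_charge / n


e_from_electrolysis = charge_per_ion(FARADAY_ESU)
relative_gap = abs(e_from_electrolysis - E_MEASURED_ESU) / E_MEASURED_ESU

print(f"charge per monovalent ion = {e_from_electrolysis:.4e} e.s.u.")
print(f"                          = {charge_per_ion(FARADAY_COULOMBS):.4e} coulombs")
print(f"relative difference from the measured e = {relative_gap:.1e}")
```

With the rounded inputs given in the text, the two values agree to well within one part in a thousand.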

Furthermore, atoms always contain electrons of the kind 
observed in the free state, as is proved by many facts which will be 
discussed presently. We shall confine these considerations to pointing
out one of the proofs which historically has been of the greatest
importance: in the Zeeman effect, the ratio e/m is determined experimentally
for the electric particle to whose motion the emission of
light is attributed, and a value is found which agrees with the one 
already determined for electrons by the deflection method. Thus 
we recognize the presence of electrons within atoms and their direct 
participation in the phenomena of light emission. 

3. The positive particles. The Rutherford atom. Although the 
facts cited above, along with others, have for a long time definitely 
established the presence of negative electricity inside the atoms in 
the form of electrons, we have yet to see the form in which the positive
electricity is found which is necessary to give the atom its
characteristics of an electrically neutral system. In 1902, J. J.
Thomson advanced the hypothesis that the atom is made up of a
homogeneous sphere of positive electricity within which the electrons 
are immersed, like dust particles in a water drop. By the mutual 
action between positive and negative charges, the electrons would 
be attracted (as may easily be shown) toward the center of the 
sphere, with a force proportional to the distance, and would occupy 
positions of equilibrium under the action of that force and their 
mutual repulsions. However, this hypothesis was readily shown to 
be inadequate for the interpretation of the experimental facts. 

⁵ Here we have one of the best ways of determining N.

The most direct refutation of the Thomson hypothesis was made
through a famous experiment performed by Lord Ernest Rutherford
in 1911 on the passage of alpha particles through matter. These
particles, which are known to be doubly ionized (that is, positively
charged) helium atoms, emitted by radioactive substances with
velocities of the order of $10^9$ cm/sec, are able to traverse thin
metallic foils by virtue of their small size and high velocity. When
a narrow beam of alpha particles strikes a foil, it does not proceed in
a straight line but is scattered into various directions, thereby proving
that alpha particles are deviated upon traversing matter: some
more, some less, and some even directly backwards. These deviations
cannot be due to the collisions of alpha particles with
electrons contained in the substance traversed, because the
latter have a mass several thousand times smaller than that of
the alpha particles and hence cannot deflect them appreciably.
Consequently, these deflections must be due to the action of the
remainder of the atom, that is, to the part which, in addition to
the positive charge, contains almost the entire mass of the atom.
If one accepts the Thomson model, it
is easily seen that the repulsion exerted by the sphere of positive
electricity upon the alpha particles increases at first upon approaching
it, but when the particle penetrates into the interior of the
sphere, the force decreases as the particle nears the center. The
resulting trajectory will be curved in the manner shown in Fig. 1(a).

The deviations obtained in this way are not very large and are 
insufficient to explain the experimental results. Therefore, Rutherford
enunciated the hypothesis (which is universally accepted today)
that the positive part of the atom consists of a nucleus, so small compared
with atomic dimensions that one may consider it to be almost
a point, in which nearly the entire mass of the atom is concentrated,
not merely the positive charge. The repulsive force which the
nucleus exerts upon the alpha particle will then be inversely proportional
to the square of the distance and will increase indefinitely
when the particle approaches the nucleus. Hence it is understandable
that a considerably larger deviation must result than in the
Thomson model [Fig. 1(b)]. A calculation of the angular distribution
of the scattered beam of alpha particles yields a distribution in
perfect agreement with the observed one (with the exception of
those particles which pass very close to the nucleus, for which the
nucleus may no longer be considered a point). Experience therefore
confirms Rutherford’s hypothesis of the “point nucleus.” It is even
possible, from the results of these experiments, to deduce the charge
of the nucleus. In this manner the rather remarkable fact is discovered
that the charge of the nucleus (except for the sign) is a multiple
of the electronic charge equal to the atomic number, or order
number in the periodic system of Mendeleev, of the element
considered. 
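The contrast between the two models can be made concrete by comparing the repulsive force on an alpha particle in each case. The following is a rough sketch only: it ignores the electrons entirely, takes the Thomson sphere to have the atomic radius of $10^{-8}$ cm mentioned earlier, and uses Z = 79 as a purely hypothetical example, since the text speaks only of thin metallic foils.

```python
E_ESU = 4.8029e-10         # elementary charge in e.s.u. (from the text)
ALPHA_CHARGE = 2 * E_ESU   # an alpha particle carries the charge 2e
ATOM_RADIUS_CM = 1e-8      # order of magnitude of the atomic radius given in the text


def thomson_force(r: float, z: int, radius: float = ATOM_RADIUS_CM) -> float:
    """Repulsion in dynes at distance r from the center of a uniformly
    charged sphere of total charge Ze (electrons ignored)."""
    if r >= radius:
        return ALPHA_CHARGE * z * E_ESU / r**2
    # Inside the sphere only the charge within radius r acts on the particle,
    # so the force falls off linearly toward the center.
    return ALPHA_CHARGE * z * E_ESU * r / radius**3


def rutherford_force(r: float, z: int) -> float:
    """Repulsion in dynes at distance r from a point nucleus of charge Ze."""
    return ALPHA_CHARGE * z * E_ESU / r**2


Z_EXAMPLE = 79  # hypothetical choice of element, not taken from the text
for r in (1e-8, 1e-9, 1e-10, 1e-11):
    print(f"r = {r:.0e} cm   Thomson: {thomson_force(r, Z_EXAMPLE):.3e} dyn"
          f"   Rutherford: {rutherford_force(r, Z_EXAMPLE):.3e} dyn")
```

Outside the sphere the two forces coincide; inside it the Thomson force diminishes toward the center while the force of the point nucleus continues to grow, which is why only the latter can produce large deflections.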

On the basis of these results, Rutherford proposed a model of the 
atom which has been of fundamental importance in the development 
of theoretical physics. He considered the atom of atomic number 
Z to be constituted of a point nucleus in which almost the entire 
mass is concentrated, having a positive charge Ze (calling e the 
absolute value of the electronic charge) around which are found Z 
electrons, whose negative charges exactly compensate for the positive
charge of the nucleus. In accordance with Coulomb’s law,
these electrons are attracted by the nucleus with a force inversely 
proportional to the square of the distance, similar to the force of 
gravity the sun exerts upon the planets. They will therefore move, 
according to Kepler’s laws, in elliptical orbits (if we neglect the 
mutual repulsions between electrons); whereas the nucleus, because 
of its large mass, will remain practically at rest. The whole atom 
will thus represent a planetary system in miniature. The hydrogen 
atom, for which Z = 1, will have a nucleus of charge e and a single 
planetary electron. (This electron will describe an exactly elliptical 
orbit, since it lacks the perturbations of other electrons.) The 
helium atom will have a nucleus of charge 2e and two planetary electrons,
and so on, up to curium (Z = 96), which will have a nuclear
charge 96e and 96 electrons. (Elements with Z = 43, 61, 85, 87,
93, 94, 95, and 96 are unstable and do not occur in nature.) 

According to this model, the quantity which is rather vaguely
called the “diameter of the atom” (and which we have seen to be
of the order of $10^{-8}$ cm) must be approximately identified with the
maximum dimensions of the outer orbits. We can say that the
atom very nearly occupies a sphere of that diameter; we understand,
of course, that this sphere is not filled with solid substance
but is almost entirely empty, and has particles traveling within it
whose dimensions are very small compared with the distances
between them.⁶

This model, which dominated atomic mechanics up until 1925, 
must not be interpreted as a faithful picture of intra-atomic reality 
but only as an approximation. Nevertheless, it retains great importance
as a heuristic or didactic means, and it serves to furnish the
basis for a convenient and expressive terminology, of which we shall 
make current use. 

4. The structure of the nucleus. Isotopes. According to the 
Rutherford model, the valence and, in general, the chemical properties
of an atom depend exclusively on the motion of the outermost
electrons. The same can be said about the optical properties, such
as, for example, the emission and absorption spectra. The innermost
electrons intervene in determining the behavior of the atom
with respect to X-ray spectra of emission and absorption. Finally, 
in atoms with radioactive properties, we must suppose that the 
latter properties reside within the nucleus, as has been confirmed by 
the fact that these properties are absolutely insensitive to chemical 
bonds, which alter the motion of the outermost electrons. They are 
also unaffected by X rays which act upon the ones closest to the 
nucleus, and in general by all physical agents known, excepting 
collisions with extremely penetrating particles or radiations. 

This fact leads us to regard the nucleus not as an elementary 
particle such as the electron but as a generally complex system; this 
viewpoint is confirmed by various other circumstances, one of which 
is the existence of isotopes. 

Although all elements can be considered to be simple bodies 
from the chemical standpoint, they may not be so regarded from the 
physical point of view. It turns out that elements may be further 
decomposed into as many as ten substances having identical chemical
properties (with different atomic weights) and almost identical
physical properties, with the exception of the possibility of radioactive
properties, which are radically different. Each of these substances
is called an isotope. The name, which means “same place”
in Greek, is derived from the fact that in the periodic table, the
various isotopes belonging to the same element must be thought of
as being located in the same cell, since each cell corresponds to
definite chemical properties.

⁶ The radius of the nucleus is of the order of cm; the radius of the
electron is considered not to exceed cm.

Each isotope is made up of identical atoms, but the atoms of 
various isotopes differ in weight and other characteristics: thus it is 
the isotopes, not the elements, which correspond to separate atomic 
species. An element is generally constituted of a mixture of atoms 
of as many species as there are isotopes making up the element. 
The proportions of this mixture remain the same in all the compounds
involving this element, as a consequence of the identity of
the chemical properties of the isotopes belonging to the same element.
It turns out that these proportions are approximately the
same no matter what the mineral from which the substance is 
obtained. 

The existence of isotopes was first discovered through the radioactive
elements. More recently, the work of Aston has shown that
many of the nonradioactive substances, which were thought to be
simple, are in reality mixtures of two, three, or more isotopes of
different atomic weight. These mixtures behave like simple substances,
since the various isotopes constituting them have almost
perfectly identical physical and chemical properties. Hence the 
atomic weights as determined by chemists are really the result of an 
average of the atomic weights of the various isotopes. The latter 
have been determined with great precision, by physical methods 
which we shall not discuss here. The most important result of 
these investigations is that the atomic weights of the single isotopes 
are always integers to a good approximation. The nonintegral 
values occurring in the usual table of chemical elements derive from 
the fact that these weights do not refer to really simple substances 
but to mixtures, as has been pointed out. Thus, potassium, for 
instance, with atomic weight 39.1, is actually a mixture of two isotopes
of atomic weights 39 and 41. The only physical properties by
which two isotopes may appreciably differ are radioactive properties
(both spontaneous and induced), and therefore the discovery
and separation of isotopes has been much easier among naturally
radioactive elements.
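The statement that the chemical atomic weight is an average over the isotopes can be illustrated with the potassium figures just quoted. The sketch below treats the isotopic weights as exactly 39 and 41, which the text says holds only to a good approximation, so the resulting proportion is a rough estimate and not a measured abundance. The function names are illustrative.

```python
def mean_atomic_weight(isotopes: list[tuple[float, float]]) -> float:
    """Abundance-weighted mean of (atomic_weight, fraction) pairs."""
    total = sum(fraction for _, fraction in isotopes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(weight * fraction for weight, fraction in isotopes)


def heavy_fraction(light: float, heavy: float, mean: float) -> float:
    """Fraction of the heavier isotope in a two-isotope mixture of known mean weight."""
    return (mean - light) / (heavy - light)


# Potassium figures from the text: isotopes 39 and 41, chemical atomic weight 39.1.
x41 = heavy_fraction(39.0, 41.0, 39.1)
check = mean_atomic_weight([(39.0, 1.0 - x41), (41.0, x41)])

print(f"estimated fraction of isotope 41: {x41:.3f}")
print(f"mean weight recovered from that mixture: {check:.2f}")
```

On these assumptions roughly one atom in twenty would belong to the heavier isotope.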

The interpretation of isotopes with the Rutherford model is as 
follows. The isotopes of the same element have the same atomic 
number, hence the same nuclear charge and the same number of 
planetary electrons, but they differ in the mass of the nucleus. 
Since the motion of the electrons is determined by electrical forces, 
and since the mass of the nucleus, which remains almost constant
from one isotope to the next, exerts very little influence upon the
motion of the electrons, the latter move in almost the same manner
in all isotopes; hence their identical physical and chemical properties.
An exception is made by possible radioactive properties,
which, as has been stated, do not depend on the planetary electrons
but only on the structure of the nucleus, which is different in different
isotopes.

The radioactive properties and their theoretical interpretation 
by means of appropriate hypotheses concerning the structure of 
nuclei constitute the subject matter of nuclear physics. This field, 
which received its theoretical basis only in recent years, has made 
enormous progress in an extremely short time, especially on the 
experimental side with the discovery of new disintegrations and of 
artificial radioactivity. During the Second World War, nuclear 
physics entered the field of practical applications with the invention 
of nuclear piles and of atomic bombs, which utilize the energy of 
atomic nuclei, commonly called “atomic energy.” However, the
physics of the nucleus is not within the scope of the present volume,⁷
since in the questions of atomic physics with which we shall be concerned,
the nucleus will always enter as a single particle of negligible
dimensions. This nucleus will be characterized solely by the value 
of its mass and of its positive electric charge (or else by the atomic 
number). Nevertheless, attention should be called to the fact that 
this theory of nuclear structure (which is in a stage of active development)
makes use of the same general principles of atomic mechanics
which are treated in this volume. 

⁷ See F. Rasetti, Elements of Nuclear Physics, New York: Prentice-Hall,
Inc., 1936; also No. 20b of the Bibliography.

We shall give here a list of the various kinds of elementary (or
supposedly elementary) particles which have so far been discovered
experimentally. We shall give their masses expressed in atomic
mass units (1/16 of the atomic mass of the most abundant isotope of
oxygen), and the electric charge expressed in terms of the elementary
charge $e = 4.8024 \times 10^{-10}$ e.s.u.

                          Particle            Mass           Charge

Heavy particles           proton              1.007574       +e
                          neutron             1.00893        0

Intermediate particles    positive mesons     ~0.11; ~0.16   +e
                          negative mesons     ~0.11; ~0.16   −e

Light particles           positive electron   0.00054862     +e
                          negative electron   0.00054862     −e


According to modern views, the nuclei of all atoms are aggregates 
of protons and neutrons (in particular, the nucleus of hydrogen consists
of a proton only).

The mesons (positive and negative) are still relatively little 
known; they constitute a part of cosmic radiation, and have only 
recently (1948) been obtained artificially. For each sign of the 
charge, there exist at least two types of masses equal to about 200 
and about 300 electron masses, respectively, or equaling 0.11 and 
0.16 atomic mass units. These values, however, are known with 
considerably less accuracy than those of the other particles. The 
name “mesons” (or “mesotrons”) derives from the fact that their
mass is intermediate between that of the electron and that of the 
heavy particles. 
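The two meson masses quoted in atomic mass units follow from the masses in electron units by a simple multiplication. A minimal check, using the electron mass listed in the table above (the function name is illustrative):

```python
ELECTRON_MASS_AMU = 0.00054862   # electron mass in atomic mass units (from the table)


def electron_masses_to_amu(n_electron_masses: float) -> float:
    """Convert a mass expressed in electron masses to atomic mass units."""
    return n_electron_masses * ELECTRON_MASS_AMU


for n in (200, 300):
    print(f"{n} electron masses = {electron_masses_to_amu(n):.2f} atomic mass units")
```

To two decimal places these reproduce the values 0.11 and 0.16 given in the text.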

The positive electrons (or positrons) and negative electrons (or 
negatrons) have the same mass and equal charges in absolute value. 
However, the symmetry between these two particles is not complete: 
although the negative electrons are found in all atoms (whose extranuclear
part they constitute, as has been stated) and can be obtained
very easily in the free state (beta rays, thermionic and photoelectric 
effects), the positive electrons have a very short life, having a 
tendency to unite with negative electrons (annihilation). Positrons 
are found in cosmic rays and are emitted by some artificially radioactive
substances. Hence it is not surprising that the positron was
discovered only in 1932, whereas the negative electron has been 
known since 1897. 

The particles which we have listed are those whose existence has 
been confirmed experimentally; to these one must add some particles
whose existence is considered plausible for theoretical reasons,
although no direct experimental confirmation exists as yet. They
are the neutrino, of negligible mass compared to the electron mass,
and of zero charge; and the neutretto, or neutral meson, of mass 
intermediate between that of the electron and the proton, and of 
zero charge. 

5. The radioactive displacement law. The identity of the
atomic number with the nuclear charge is confirmed by the so-called
radioactive displacement law. Indeed, it is found that when a
radioactive substance emits negative beta rays, it is transformed into
another having the same atomic weight but an atomic number
greater by one unit (that is, it has been “displaced” one cell ahead
in the periodic system). As a matter of fact, the nucleus, in losing
one electron, does not vary appreciably in weight, whereas its
positive charge increases by e. On the other hand, when a substance
emits alpha rays, it is transformed into another having an atomic
weight smaller by four units and an atomic number smaller by two
units (that is, it has been displaced backward by two cells on the
periodic chart). An alpha particle (which is known to be identical
with a helium nucleus) has a weight of four and a charge 2e; hence
the weight of the nucleus which emitted it must diminish by four
units, and the charge must decrease by 2e.

In particular, if two beta emissions and one alpha emission follow 
each other in any order, the atomic weight should diminish by four 
but the nuclear charge should remain unchanged; we should thus 
end up with another isotope of the same element. This process 
actually occurs in certain cases. For example, radium D (A = 210,
Z = 82), emitting a beta particle, is transformed into radium E
(A = 210, Z = 83), which in turn emits another beta particle, thus
becoming polonium (A = 210, Z = 84), which finally, by emitting
an alpha particle, is transformed into radium G (A = 206, Z = 82).
Now it is found that radium G has the same chemical properties as
radium D; that is, they are both isotopes of the same element (lead).

A radioactive displacement law analogous to the one referring
to negative beta rays also holds for the emission of positrons, which
takes place in many artificially radioactive substances: in that case
the isotope, without appreciably changing in atomic weight, is
displaced backward by one cell; that is, its atomic number diminishes
by one unit, in accordance with the loss of an elementary positive
charge.
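
The displacement rules can be written as simple operations on the pair
(A, Z). The following is a minimal sketch in Python; the function names
are hypothetical and serve only to retrace the radium D chain described
in the text.

```python
def beta_minus(a, z):
    """Negative beta emission: atomic weight unchanged, atomic number up by one."""
    return a, z + 1

def beta_plus(a, z):
    """Positron emission: atomic weight unchanged, atomic number down by one."""
    return a, z - 1

def alpha(a, z):
    """Alpha emission: atomic weight down by four, atomic number down by two."""
    return a - 4, z - 2

# Radium D -> radium E -> polonium -> radium G, as in the example of the text.
radium_d = (210, 82)
radium_e = beta_minus(*radium_d)
polonium = beta_minus(*radium_e)
radium_g = alpha(*polonium)

print(radium_e, polonium, radium_g)  # (210, 83) (210, 84) (206, 82)
```

Since each rule only adds or subtracts fixed amounts, the final pair
does not depend on the order in which the two beta emissions and the
alpha emission take place.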

All these facts plainly show that all chemical and physical
properties of an atom (except radioactive properties) which determine its
position on the periodic table are themselves determined by the
value of its nuclear charge, and that the “atomic number,”
originally defined as an order number on that table, can be identified with
the number of elementary charges carried by the nucleus.



CHAPTER 2 


Energy Quanta and Light Quanta 

6. Inapplicability of classical laws to atoms. Although the 
Rutherford model of the atom is very attractive because of its 
simplicity, it leads to results in full disagreement with experience if 
we assume that its behavior is governed by the ordinary laws of 
mechanics and electricity. 

For instance, according to the laws of electromagnetic theory, an
electric charge radiates electromagnetic energy whenever it possesses
a certain acceleration. Thus an electron which rotates around the
nucleus, having a constant centripetal acceleration, should
constantly radiate electromagnetic waves. Consequently, its energy
should gradually diminish; this would lead to a gradual reduction
of the dimensions of its orbit so that the electron would finally fall
into the nucleus. Therefore the Rutherford atom could not have a
permanent character; and its life could be calculated to be of the
order of lO""* sec. 

Furthermore, the energy would be emitted in the form of
radiation whose fundamental frequency would coincide with the
frequency of the orbital motion of the electron. But since the latter
frequency would be continually varying because of the shrinking of
the orbit, the emitted light would have a changing frequency;
therefore any body, containing countless atoms in all possible phases of
their “lives,” should emit radiation of all possible frequencies, or a
continuous spectrum. However, it is known that gases emit line
spectra of rigorously constant frequency.

When these difficulties presented themselves to the physicists, 
the latter had already been convinced for other reasons that classical 
mechanics and electromagnetic theory did not apply in the atomic 
domain, so that, rather than abandon the Rutherford model, they 
sought to find laws which would make it function in such a manner as 
to account for the experimental results. These laws were proposed 
for the first time by Niels Bohr in 1913, and subsequently stated in a
more general form by A. Sommerfeld. However, prior to an
exploration of these laws, we must rapidly outline those other fields in
which contradictions with classical laws had already been
encountered, and the attempts at the substitution of new laws. The first
of these attempts, which opened the road to modern atomic physics,
was that of the German physicist Max Planck, who in 1900 tried to
give a theory of black-body radiation. An additional contribution
concerning the nature of light rather than of matter was made by
Einstein in 1905, introducing the concept of “light quanta” or
“photons.”

7. The spectrum of the black body. It is well known that
incandescent bodies emit light; or, more generally, bodies emit
radiation at any temperature, visible at higher temperatures but
invisible for lower temperatures (infrared radiation). The type of
radiation emitted depends, to a certain extent, on the nature of the
body. However, if the latter is a “black body,” that is, a body
capable of absorbing all radiation it receives (such as lampblack, to
a certain degree of approximation), one can show
thermodynamically, and experiment confirms the result, that the radiation emitted
at a given temperature is independent of the nature and of the shape
of the body. When this radiation is analyzed with a spectroscope,
it shows a continuous spectrum, whose intensity has been measured
exactly at all points and at various temperatures. The intensity
exhibits a maximum for a certain frequency νₘ and decreases toward
zero for high as well as for low frequencies, as is indicated by the
diagram in Fig. 2.

It is natural to suppose that the emitted radiation is due to the 
thermal agitation of the electric charges contained within matter; 
but if we try to express this idea quantitatively by applying the 
ordinary laws of mechanics and electromagnetic theory, we arrive 
at an absurd result. Instead of the experimental curve, we find an 
ascending parabolic curve which, near the origin, coincides with the 
experimental curve but then deviates from it in the manner shown 
in Fig. 2. The parabola is represented by the so-called
Rayleigh-Jeans formula

$$ I(\nu)\,d\nu = \frac{2\pi\nu^2}{c^2}\,kT\,d\nu, $$

where k is Boltzmann's constant, c is the velocity of light, T is the
absolute temperature, and I(ν) dν represents the energy radiated
per unit area in the spectral band between ν and ν + dν. According
to this formula, the total energy radiated at any temperature
different from absolute zero is infinite, as can be seen immediately by
integrating with respect to ν.

Various attempts were made to escape this difficulty; but so long
as use was made of classical mechanics and electromagnetic theory,
they invariably led to the Rayleigh-Jeans formula, even when the
particular mechanism of emission was modified in various ways.
While making one of these attempts, toward the end of 1900,
Planck¹ discovered at which point one had to break away from the
classical treatment in order to reach a result conforming fully to
experience.

¹ Verh. d. D. Phys. Ges. 2, 237 (1900); Ann. d. Physik 4, 553 (1901).

² An oscillator is a system formed by an electrically charged mass point
(such as an electron) attracted to a fixed center with a force proportional to
the distance. It is known from mechanics that such a point executes
isochronous oscillations about the fixed center, with a frequency independent of
amplitude and characteristic of the oscillator. However, Planck’s reasoning
does not necessarily require this model but remains valid under more general
assumptions.

He started out with the hypothesis that in atoms there are
contained tiny oscillators² of all possible frequencies, in constant
vibration (each with its proper frequency); these oscillators continually
absorb and emit energy in the form of electromagnetic waves.
Through an artifice of calculation, Planck attacked the problem by
assuming that the energy of each oscillator varies not continuously
but in steps of a certain magnitude ε, reserving the right of letting
the latter tend toward zero later. But he found that if he permitted
ε to go to zero, he would again be led to the Rayleigh-Jeans formula;
on the other hand, when he let ε retain a finite value proportional to
the proper frequency of the oscillator, by setting

$$ \varepsilon = h\nu \tag{1} $$

(h being constant), he arrived at the formula

$$ I(\nu)\,d\nu = \frac{2\pi\nu^2}{c^2}\,\frac{h\nu}{e^{h\nu/kT} - 1}\,d\nu, \tag{2} $$


which, for a fixed parameter T, represents a curve of the type shown
in solid line in Fig. 2. Furthermore, upon assigning to h a
convenient numerical value, which according to the latest
determinations³ is, in c.g.s. units,

$$ h = 6.6237 \times 10^{-27}, $$

formula (2) exactly represents the energy distribution in the
spectrum of the black body for any value of the temperature T.
Following this surprising result, Planck proposed the bold hypothesis that
energy, like matter, possesses an atomic structure; that is, it occurs
always in amounts that are multiples of an elementary quantity ε,
which he called a quantum.⁴ The latter must have, according to
what has been said, a value proportional to the frequency with
which the system under consideration is able to oscillate.
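
The relation between formula (2) and the Rayleigh-Jeans formula can be
illustrated numerically. The following is a sketch in Python under
stated assumptions: the value of h is the one quoted in the text,
whereas the values of k and c and the temperature of 1500 degrees are
assumptions introduced here only for illustration.

```python
import math

H = 6.6237e-27   # Planck's constant, erg sec (value quoted in the text)
K = 1.380e-16    # Boltzmann's constant, erg/deg (assumed value, not from the text)
C = 2.998e10     # velocity of light, cm/sec (assumed value, not from the text)

def rayleigh_jeans(nu, T):
    """Classical intensity per unit frequency: (2 pi nu^2 / c^2) k T."""
    return 2.0 * math.pi * nu**2 / C**2 * K * T

def planck(nu, T):
    """Formula (2): (2 pi nu^2 / c^2) h nu / (exp(h nu / k T) - 1)."""
    x = H * nu / (K * T)
    return 2.0 * math.pi * nu**2 / C**2 * H * nu / math.expm1(x)

T = 1500.0  # illustrative temperature, degrees absolute
for nu in (1e11, 1e13, 1e14, 1e15):
    ratio = planck(nu, T) / rayleigh_jeans(nu, T)
    print(f"nu = {nu:.0e} per sec: Planck / Rayleigh-Jeans = {ratio:.3e}")
```

At low frequencies the ratio is close to unity, so that the two curves
coincide near the origin; at high frequencies the exponential in the
denominator makes formula (2) fall toward zero, while the classical
parabola continues to rise.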

Thus there entered for the first time into science the concept of a 
discontinuity in the physical laws, in contrast to the ordinary 
mechanical and electromagnetic theories. The old theories were 
called classical, in order to distinguish them from the new ones, 
called quantum theories. Corresponding to this new concept, there 
entered into physics a new universal constant h, called Planck's
constant, or quantum of action, whose importance was subsequently
revealed to be very great in a multitude of phenomena. 

³ Cf. J. W. M. DuMond and E. R. Cohen, loc. cit.

⁴ Latin word used in German in a sense meaning “quantity,” “dose.”

The extreme hypothesis of an atomic structure for energy was
soon reduced in scope by Planck himself, who showed that it is
sufficient to assume a discontinuity only in the process of emission 
and not in the nature of the energy itself. Although, even in that 
form, the hypothesis was repugnant to the minds of many scientists, 
the success of its further developments in accounting for the specific 
heats of solids (theories of Einstein and Debye, 1907 and 1912) 
confirmed many people in their conviction that the hypothesis was 
a step in the right direction. 

Today the formula of Planck can be justified by means
independent of the oscillator hypothesis. In the case of the oscillator,
the hypothesis of energy quanta appears as a particular consequence
of rather general laws. These laws are based upon hypotheses
which seem considerably more satisfactory and plausible than the
original hypothesis of quanta, because they are devoid of
contradictions both within themselves and with the laws which ordinary
bodies obey; they also account in unique fashion for a very large
number of phenomena.

8. The photoelectric effect and “light quanta.” While in the
field of the structure of matter, Planck's hypothesis gives rise to the
idea that classical laws would not be applicable to the atom,
Einstein pointed out a no less serious contradiction between certain
experimental facts and the wave theory of light.

This contradiction, to which we shall turn in greater detail below,
essentially consists in the following. It is known that if radiation
of sufficiently high frequency (in general, ultraviolet light, X rays,
or gamma rays) is made to fall upon a metallic surface, the latter
emits electrons with a certain energy, which is without a doubt
imparted to them by the incident radiation. The phenomenon is
called the photoelectric effect in the case where the exciting radiation
consists of visible or ultraviolet light. We shall, however, use the
term also in the more general case. Now it is found that the energy
which each single electron receives from the radiation (an energy
which is partly used to tear the electron from the metal and partly
remains with the electron as kinetic energy) is independent of the
light intensity but depends solely upon its frequency; the intensity
influences only the number of electrons emitted (which is
proportional to the intensity) and not the energy with which each electron is
emitted. Thus one is forced to abandon the most natural
hypothesis: that the electrons are constrained to oscillate under the action
of the electric field of the light and that, when their oscillations
reach sufficiently large amplitudes, they are severed from the atom
to which they belong and are ejected. However, a more serious
difficulty presents itself independently of any particular hypothesis
concerning the mechanism of the photoelectric effect: when the
intensity of the light is diminished, a point can be reached where the
energy incident upon each atom during the entire experiment
(calculated under the assumption that the energy falls uniformly
upon the whole surface) is considerably below the kinetic energy
with which the electron is ejected; yet even under these conditions
the electrons are being emitted with the same kinetic energy.

The rather remarkable law which relates the kinetic energy w
with which the electrons are emitted to the frequency ν of the
incident radiation is

$$ w = h\nu - W_0, \tag{3} $$

where h is Planck's constant and W₀ is a constant characteristic of
the metal in question. For values of ν which would make the
right-hand side negative (that is, for ν < ν₀, where ν₀ = W₀/h), there is no
emission. Thus the photoelectric effect exhibits the curious
property of setting in suddenly at a certain frequency ν₀ which is
called the photoelectric threshold and is, like W₀, a characteristic of
each metal. This law also obstinately frustrates every attempt at
classical justification, although the presence within it of the same
constant h which occurs in the theory of the black body gives an
indication of its relation to some profound law of nature.
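
Formula (3) and the photoelectric threshold can be illustrated with a
short calculation. The following is a sketch in Python; the work of
extraction of 2.3 electron volts is a hypothetical value chosen only
for illustration, and the conversion factor from electron volts to ergs
is an assumed value not given in the text.

```python
H = 6.6237e-27          # Planck's constant, erg sec (value quoted in the text)
ERG_PER_EV = 1.602e-12  # assumed conversion factor, not from the text

def threshold_frequency(w0):
    """Photoelectric threshold nu_0 = W0 / h."""
    return w0 / H

def kinetic_energy(nu, w0):
    """Formula (3): w = h nu - W0 in ergs; None below the threshold (no emission)."""
    w = H * nu - w0
    return w if w >= 0.0 else None

W0 = 2.3 * ERG_PER_EV   # hypothetical constant of the metal, for illustration only
nu0 = threshold_frequency(W0)

print(f"threshold frequency = {nu0:.3e} per sec")
for factor in (0.5, 1.5, 2.0):
    print(f"nu = {factor} nu_0: w = {kinetic_energy(factor * nu0, W0)}")
```

The light intensity does not appear anywhere in the calculation, which
reflects the experimental fact that it influences only the number of
electrons emitted and not the energy of each.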

Einstein, in a celebrated paper,⁵ showed that the strange
behavior of the photoelectric effect is consistent with a
corpuscular model of radiation. He considered this model solely for its
heuristic value, however, without emphasizing its concrete
significance. Indeed, if it is assumed that the energy of monochromatic
radiation is not uniformly distributed over the entire wave front
but instead travels localized in corpuscles (called light quanta by
Einstein, also called photons⁶ today) each of which contains a
quantity of energy hν and travels with the velocity of light, the
difficulties of the photoelectric phenomenon disappear and we
arrive immediately at formula (3). As a matter of fact, the paradox
concerning photoelectric emission with weak light is explained right
away when we consider that when the light is weak, there will be
few light quanta but each will always contain the energy hν. Hence
there will be few atoms receiving a quantum, but each of those few
atoms will always receive an energy hν independently of the light
intensity. Formula (3) is then immediately justified if we interpret
W₀ as the energy the electron has to expend to break away from the
metal. Indeed, the electron, which has received the energy hν
from the light quantum, will, upon expending an amount W₀ of this
energy to escape from the metal, emerge with a residual energy
hν − W₀ in the form of kinetic energy. If the energy of the light
quantum is below W₀ (that is, if ν < ν₀), the electron will not be able
to escape, as has been actually found.

⁵ “Über einen die Erzeugung und Verwandlung des Lichtes betreffenden
heuristischen Gesichtspunkt,” Ann. d. Physik 17, 132 (1905).

⁶ This designation is not to imply that the hypothesis applies only to light,
strictly speaking; it includes every type of electromagnetic radiation.

It is to be noted that the hypothesis of “light quanta” is not
exactly a return to the old corpuscular (or ballistic) theories of light,
since the energy of light quanta does not necessarily have to be
thought of as kinetic energy of a particle in motion. Attempts
have been made to interpret the energy of light quanta as
electromagnetic energy localized in a small region of space, but it is not
possible to reconcile such a view with the classical equations of
Maxwell.

We may observe, however, that the hypothesis of “light quanta”
is distinct from the hypothesis of the quanta of Planck: the former
deals with the manner in which energy travels through space; the
latter refers to the manner in which energy is emitted and absorbed
by matter. Nevertheless, it is evident that these hypotheses
strengthen each other, and that both groups of phenomena to which
they refer make it clear that a profound physical relation must exist
between the frequency of a radiation and the quantity hν of
energy.

As will be said later on, the hypothesis of light quanta has 
subsequently been confirmed by other experimental facts, and in 
present theory the concept of light quantum or photon plays as
important a part as that of the electron and the proton. 

It is known that the electromagnetic theory of light proves that
with a radiant energy W there must be associated an electromagnetic
momentum W/c, which manifests itself, when absorbed by matter,
by the phenomenon of light pressure (cf. bibliography 32a, §142 and
§143). Hence it is natural that every theory of light, in order to be
in accord with the facts, must associate a momentum W/c with
radiant energy W. Thus, in addition to an energy hν, we must
attribute to photons, independently of any concrete corpuscular
concept, a momentum p of magnitude

$$ p = \frac{h\nu}{c} = \frac{h}{\lambda} $$

in the direction of propagation of the light.⁷ A further experimental
proof of the exactness of this hypothesis is furnished by the Compton
effect and by certain experiments related to it, which we shall now
discuss briefly.

9. The Compton effect. When a beam of X rays traverses
matter, part of it will be scattered in all directions in the manner of
light which traverses a turbid medium. Analysis of this scattered⁸
radiation with a crystal spectrograph demonstrates that it does not
possess a spectral constitution identical to the primary radiation.
If, for instance, the latter is monochromatic of frequency ν, the
spectrum of the scattered radiation will exhibit, in addition to the
line of frequency ν, a line of slightly lower frequency ν′. This line
has considerable intensity, even superior to the intensity of the
“undisplaced” line, if the scattering substance is of low atomic
number (such as carbon or aluminum) and if the incident rays are
“hard” (high ν), whereas it is rather weak or is missing altogether
if the substance has high atomic number (such as silver) and if the
incident rays are soft. This phenomenon of scattering with a slight
increase of wavelength was discovered in 1923 by the American
physicist A. H. Compton and carries his name.

⁷ We remember from relativistic mechanics (cf. §§184 ff of No. 32a of the
Bibliography) that a point particle having a rest mass m₀ has, at velocity v, a
mass m = m₀/√(1 − v²/c²), a momentum p = mv, and a kinetic energy
w = (m − m₀)c². These formulas can also be applied to the photon, provided
that we set v = c, m₀ = 0. Thus we can think of the photon as a
limiting case of a point particle, having a rest mass zero and a velocity equal to c.

⁸ With the actually scattered radiation there is mixed in the secondary
radiation whose spectrum is the one characteristic of the scattering substance;
this is easily distinguished from the scattered one, and we shall not be
concerned with it here.

The change in wavelength is different for radiation scattered in
the different directions: it is zero in the direction of the primary rays,
increases with the angle θ between the direction of observation and
that of the primary rays, and reaches its maximum value (which
amounts to only 0.048 Å) for θ = 180°, that is, when the scattered
radiation is observed straight back. This variation is well
represented by the formula

$$ \lambda' = \lambda + 0.024\,(1 - \cos\theta), \tag{4} $$

where λ and λ′ must be expressed in angstrom units. As can be
seen from this formula, the change in wavelength λ′ − λ is
independent of λ, so that the relative effect is appreciable for small values
of λ, or for “hard” rays, whereas it becomes negligible when λ is
large compared with 0.048 Å.
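
The dependence of the relative effect on the wavelength can be made
concrete by evaluating formula (4) directly. The following is a sketch
in Python; the three wavelengths are illustrative values chosen here
(a hard X ray, a soft X ray, and visible light) and do not come from
the text.

```python
import math

def scattered_wavelength(lam, theta_deg):
    """Formula (4): wavelengths in angstrom units, angle of scattering in degrees."""
    return lam + 0.024 * (1.0 - math.cos(math.radians(theta_deg)))

# Maximum change of wavelength (observation straight back, theta = 180 degrees).
for lam in (0.1, 1.0, 5000.0):
    shift = scattered_wavelength(lam, 180.0) - lam
    print(f"lambda = {lam} A: shift = {shift:.3f} A, relative shift = {shift / lam:.1e}")
```

The absolute shift is the same in all three cases, whereas the relative
shift decreases in proportion as the wavelength increases.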

The classical theories give no explanation of the Compton effect, 
since according to these theories the scattering is due to the fact that 
the electrons contained in the scattering substance execute forced 
oscillations under the influence of the alternating electric field of the 
incident radiation, and hence in turn become centers of emission for 
waves. However, these oscillations must take place with the
frequency of the incident radiation, and hence the scattered radiation
must have precisely the same frequency. 

10. Quantum theory of the Compton effect. According to the
theory of quanta, the scattering process must instead be thought of 
as the effect of collisions between photons of the incident radiation 
and the electrons of the scattering substance. After collision, the 
photons are deviated from their original direction just like the 
alpha particles in the Rutherford experiment which we have already 
discussed. This collision, no matter what its mechanism, is 
governed by the two fundamental laws of conservation of energy 
and momentum, as if we were dealing with a collision between
perfectly elastic bodies. Hence it is clear that the electron acquires a
certain velocity in the impact and thus removes a certain amount of 
energy from the incident photon, which is therefore scattered with 
less energy. And since the energy of the photons is related to their 
frequency by the relation E = hν, the scattered photon must have
a frequency lower than that of the incident photon. In this way 
the Compton effect is explained qualitatively. 

We may now set down this reasoning quantitatively. Let us
call v the velocity acquired by the electron that has been struck;
then its kinetic energy will be, according to relativistic mechanics,
mc²(1/√(1 − v²/c²) − 1), where m is the rest mass of the electron;⁹
whereas the energy of the incident quantum is hν and the energy of
the scattered quantum is hν′. The principle of conservation of
energy yields

$$ h\nu = h\nu' + mc^2\left(\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right). \tag{5} $$


Let us take three vectors (see Fig. 3) to represent, respectively,
the momentum of the incident photon (vector AO, of length hν/c),
the momentum of the scattered photon (vector OB, of length hν′/c),
and the momentum acquired by the electron (vector OD, of length
mv/√(1 − v²/c²)). We observe that the first vector must be the
resultant of the other two. From the triangle formed by these
vectors, using the cosine law, we obtain immediately

$$ \frac{m^2v^2}{1 - v^2/c^2} = \frac{h^2\nu^2}{c^2} + \frac{h^2\nu'^2}{c^2} - 2\,\frac{h^2\nu\nu'}{c^2}\cos\theta. \tag{6} $$

Eliminating v between equations (5) and (6) gives

$$ \nu - \nu' = \frac{h\nu\nu'}{mc^2}\,(1 - \cos\theta). \tag{7} $$

By expressing the frequencies in terms of the wavelengths, that is,
setting ν = c/λ, ν′ = c/λ′, we obtain, after simplifying,

$$ \lambda' = \lambda + \frac{h}{mc}\,(1 - \cos\theta). \tag{8} $$

⁹ If we had used the classical expressions for kinetic energy and momentum,
we should have arrived at results which are practically indistinguishable from
those reached by relativistic mechanics, but the formulas would be somewhat
more complicated.

If we substitute the numerical values for h, m, and c, we obtain

$$ \frac{h}{mc} = 2.43 \times 10^{-10}\ \text{cm} = 0.0243\ \text{\AA}, $$

so that equation (8) coincides exactly with (4), which describes, as
has been said, all experimental results.
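
The numerical substitution can be retraced as follows. This is only a
sketch in Python: the value of h is the one quoted in the text, while
the values used for the rest mass of the electron and for the velocity
of light are assumptions introduced here, since the text does not state
them at this point.

```python
H = 6.6237e-27    # Planck's constant, erg sec (value quoted in the text)
M_E = 9.107e-28   # rest mass of the electron, g (assumed value, not from the text)
C = 2.998e10      # velocity of light, cm/sec (assumed value, not from the text)

# The constant h/mc of formula (8), first in centimeters, then in angstrom units.
compton_length_cm = H / (M_E * C)
compton_length_angstrom = compton_length_cm * 1e8   # 1 cm = 10^8 angstrom units

print(f"h/mc = {compton_length_cm:.3e} cm = {compton_length_angstrom:.4f} A")
```

With these assumed constants the result agrees with the coefficient of
formula (4) to the number of figures given there.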

In this theory it has been assumed that the scattering electrons
are free. In reality they are more or less bound to the nucleus of
the atom to which they belong, and hence the entire atom (and
possibly even an entire molecule) participates to a certain extent in
the collision. In order to see the effect of this circumstance, we
shall consider the extreme case where the electron is rigidly bound to
its atom. It will then behave like a body, not of mass m but of mass
M equal to the mass of the atom, that is, several thousand times
larger. The preceding reasoning is still valid, provided that we
substitute M for m. Hence in the final formula (8), the constant
h/mc, which determines the entire effect, will become several
thousand times smaller, which means that the Compton effect will
become negligible. This consideration justifies the presence of the
undisplaced line, which is due to the electrons more strongly bound
to the nucleus, whereas the displaced line is due to those peripheral
electrons which can be considered to be practically free. For a
given frequency ν in elements of high atomic number, that is, with
high nuclear charge, it is natural that the effect of the bound
electrons (which are more numerous) should prevail, whereas in the
light elements almost all the electrons behave as if they were free.
It is also clear that, in a given substance, the electrons can be
considered the freer, the higher the energy of the photons that impinge
upon them, or else the higher the frequency ν. As has been said,
all this is in perfect agreement with experience.
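
The effect of substituting the mass of the atom for the mass of the
electron can be estimated numerically. The following is a sketch in
Python for the extreme case of an electron rigidly bound to its atom;
the choice of carbon (atomic weight 12) is an illustration introduced
here, and the function name is hypothetical.

```python
H_OVER_MC = 0.0243              # h/mc for a free electron, angstrom units (from the text)
ELECTRON_MASS_AMU = 0.00054862  # electron mass, atomic mass units (table of particles)

def max_shift(mass_amu):
    """Maximum change of wavelength (theta = 180 degrees) for a scatterer of given mass."""
    return 2.0 * H_OVER_MC * ELECTRON_MASS_AMU / mass_amu

free = max_shift(ELECTRON_MASS_AMU)   # free electron
bound_to_carbon = max_shift(12.0)     # electron rigidly bound to a carbon atom

print(f"free electron:   {free:.4f} A")
print(f"bound to carbon: {bound_to_carbon:.2e} A")
print(f"reduction factor: {free / bound_to_carbon:.0f}")
```

The shift for the bound electron is smaller by a factor equal to the
ratio of the two masses, so that the corresponding line is practically
undisplaced.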

11. The experiment of Bothe and Geiger. The preceding theory
of the Compton effect (which is due to Compton and Debye)
received valid support in an experiment performed by Bothe and
Geiger in 1925.¹⁰ It was designed to verify whether, simultaneously
with the scattering of the photon, an electron is actually projected
(knock-on electron), as the theory predicts. In order to detect both
the scattered photons and the knock-on electrons, these investigators
used a point counter, an instrument that had already been employed
by Rutherford and Geiger to count the alpha particles of radium.
It consists of a special ionization chamber capable of giving an
electrical signal whenever a single electron or photon or any ionizing
particle enters it. In the experiment of Bothe and Geiger, two
point counters were placed on either side of a narrow beam of X rays
traversing hydrogen. One of the two counters was covered with a
thin platinum foil able to stop the electrons but not the X rays;
hence this counter registered only the photons scattered by the
hydrogen. The other counter registered the knock-on electrons,
and in addition the photons scattered to its side of the beam, as well
as the electrons occasionally produced by the photoelectric effect.
Therefore a complete coincidence between the signals from the two
counters could not be expected, and in fact the second counter gave
quite a few more signals than the first. However, the experimenters
observed the remarkable result that for each signal of the first
counter there was a simultaneous one in the second counter, proof
that for each photon scattered to one side there is always a knock-on
electron propelled to the other side, in perfect coincidence.

¹⁰ W. Bothe and H. Geiger, Zeits. f. Physik 32, 639 (1925).

12. The experiment of Compton and Simon. An even more
significant experiment for the confirmation of the preceding theory
of the Compton effect was performed by Compton and Simon in
1925.¹¹ They used the cloud-chamber method of Wilson, which
permits the trajectories of the electrons traversing a gas to be made
visible in the form of fine droplet tracks. In this manner these
investigators succeeded not only in revealing the existence of the
knock-on electron photographically but also in verifying that the
direction in which it is projected is the one predicted by the laws of
elastic collision.

¹¹ A. H. Compton and A. W. Simon, Phys. Rev. 26, 289 (1925).

We shall now see what relations between the direction OB of the
scattered photon and the direction OD of the knock-on electron are
given by these laws, that is, between the angles θ and θ′ (Fig. 3).
Instead of writing down relation (6) between only the magnitudes
of the three vectors in question, we shall use the principle of
conservation of momentum in full by expressing the fact that the
projection of the momentum in the direction AO and its projection in a
direction perpendicular to AO must be conserved. We then write,
instead of (6), the two equations

$$ \frac{h\nu}{c} = \frac{h\nu'}{c}\cos\theta + \frac{mv}{\sqrt{1 - v^2/c^2}}\cos\theta', $$

$$ 0 = \frac{h\nu'}{c}\sin\theta - \frac{mv}{\sqrt{1 - v^2/c^2}}\sin\theta', $$

whence (leaving only the last term on the right-hand side and
dividing the second by the first equation)

$$ \tan\theta' = \frac{\nu'\sin\theta}{\nu - \nu'\cos\theta}. $$

If we then substitute for ν′ the expression obtained from (7), we
have, after a simple transformation,

$$ \tan\theta' = \frac{1}{1 + h\nu/mc^2}\,\cot\frac{\theta}{2}. \tag{9} $$

This is the desired relation. From it we find, in particular, that
when we vary θ from 0 to 180°, θ′ varies from 90° to 0°. Hence the
electrons are always scattered forward.

In order to verify this formula experimentally, Compton and
Simon took a large number of stereoscopic photographs with the
cloud chamber containing a gas traversed by a narrow beam of
X rays. Many of these photographs caught a scattering act and
appear as shown schematically in Fig. 4, where XX represents the
beam of primary X rays. From a point O of the beam there starts
the track OA of a knock-on electron. The scattered photon cannot
leave a track along its trajectory; it becomes visible when it
encounters an atom C which absorbs it, since that atom then emits an
electron by the photoelectric effect, and the track CD of the electron is
visible in the photograph. By connecting its origin C with O, we
obtain the direction in which the photon has been scattered (dashed
line). The tangent OB to the track OA, drawn near its origin,
gives the initial direction of the knock-on electron. By means of
the two stereoscopic photographs we can thus determine the values
(in space) of the angles θ and θ′, and it is found that they satisfy
equation (9) within the limits of experimental error.
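
The “simple transformation” leading to equation (9) can be checked
numerically by comparing it with the ratio of the two momentum
equations, in which ν′ is taken from (7). The following is a sketch in
Python; the symbol alpha stands for the ratio hν/mc², and the value 0.2
assigned to it is a hypothetical one chosen only for illustration.

```python
import math

def scattered_frequency_ratio(alpha, theta):
    """nu'/nu from equation (7), with alpha = h nu / (m c^2)."""
    return 1.0 / (1.0 + alpha * (1.0 - math.cos(theta)))

def electron_angle_from_momentum(alpha, theta):
    """theta' from tan(theta') = nu' sin(theta) / (nu - nu' cos(theta))."""
    r = scattered_frequency_ratio(alpha, theta)
    return math.atan2(r * math.sin(theta), 1.0 - r * math.cos(theta))

def electron_angle_formula_9(alpha, theta):
    """theta' from equation (9): tan(theta') = cot(theta/2) / (1 + alpha)."""
    return math.atan2(math.cos(theta / 2.0), (1.0 + alpha) * math.sin(theta / 2.0))

alpha = 0.2   # hypothetical value of h nu / (m c^2), for illustration only
for deg in (10, 60, 120, 170):
    theta = math.radians(deg)
    a = math.degrees(electron_angle_from_momentum(alpha, theta))
    b = math.degrees(electron_angle_formula_9(alpha, theta))
    print(f"theta = {deg:3d} deg: theta' = {a:.3f} deg (momentum), {b:.3f} deg (formula 9)")
```

The two columns coincide, and θ′ remains between 0° and 90° for every
angle of scattering, in agreement with the statement that the electrons
are always projected forward.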

More recently, Crane, Gaertner, and Turin performed analogous 
experiments with gamma rays (instead of X rays), measuring directly 
(by deflection in a magnetic field) the velocity of the knock-on 
electron, as well as the angles θ and θ'. These experiments are also
in accord with theory.

13. Impossibility of a literally corpuscular theory of light. The
hypothesis that light is of a corpuscular nature (that it is composed
of particles in the intuitive sense of the word) seems almost to be
dictated by the phenomena of which we have spoken in the preceding
sections. Nevertheless, it encounters serious difficulties in
another and no less vast category of phenomena: those which have 
constituted, from the time of Huygens on, the experimental basis of 
the wave theories of light. The most important of these phenomena 
are interference and diffraction. 

In order to realize the nature of these difficulties, let us consider 
a particularly simple case of interference, that of the Fresnel mirrors 
(Fig. 5). Let S be the source, A and B the mirrors, and N a point 
such that the optical paths SAN and SBN differ by an odd number 
of half wavelengths. Let us suppose at first that mirror B is 
covered by a screen. Then there will be light at N, or else (using
corpuscular terminology) particles which have followed the line
SAN will arrive at N. If we uncover mirror B, a dark fringe is
produced at N, whereas at other points the illumination is reinforced
(bright fringes). We might suppose that the particles that
came along the line SAN are being influenced, from a distance, by
the covering up of mirror B and are being deviated. However, this
hypothesis, besides being strange in itself, does not explain why,
upon observing the fringes in another plane x'x' instead of plane xx,
we may find a bright fringe at N', that is, on the extension of ray
AN. We must, then, suppose that the particles coming from A
have passed through N but have been rendered ineffective in some
manner at that point by the simultaneous presence of particles coming
from B. But this hypothesis, besides leaving unexplained the
fact that in other points of the screen there is reinforcement of 
illumination, is contradicted by the experimental fact that the 
phenomena of interference are still taking place when the light is so 
weak that the apparatus is traversed by one particle at a time. 
Thus we are forced to repudiate the assumption that the particles 
may interact in such a manner as to render each other ineffective in 
certain points and to reinforce each other at other points. 

1 A proof that interference occurs even with only one quantum at a time 
can be found in the fact that astronomical telescopes give clear images 
even of the very weakest stars, and it is known that images in optical 
systems are formed by a process of interference. Stars of 10th magnitude 
are perfectly visible with a telescope of 10 cm aperture; their radiation has
an intensity of about 3.5 × 10^-10 erg/cm^2/sec.

Assuming light of an average wavelength λ = 5500 Å and calculating
the energy corresponding to each quantum by using the relation E = hν,
we find 36 × 10^-13 erg; hence the intensity of a 10th-magnitude star is
approximately 100 quanta per cm^2 per second. There are thus 8000 quanta
entering the telescope every second. If we consider that each quantum 
traverses the telescope in a time of the order of 10^-8 sec, we see that the
individual quanta traverse the telescope separately, with large intervals 
between them. The image formation under these conditions would imply 
that each single quantum uses the entire aperture of the objective in turn, 
and would lead us to attribute to the quanta transverse dimensions of 
many centimeters or several meters, in contrast with the experiments on 
the photoelectric effect, which led us to assign to them atomic dimensions. 
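The arithmetic of this estimate can be retraced with a short sketch. It uses the values of h and c quoted in the footnotes of this book; the function name and the transit time of 10^-8 sec (an order of magnitude only) are assumptions of the example.

```python
import math

H = 6.624e-27    # Planck constant in erg*sec (value quoted in this book)
C = 2.99776e10   # velocity of light in cm/sec (value quoted in this book)


def quanta_entering(intensity, wavelength_cm, aperture_cm):
    """Return (energy per quantum, quanta per cm^2 per sec, quanta per sec)."""
    energy = H * C / wavelength_cm              # E = h*nu = h*c/lambda, in erg
    flux = intensity / energy                   # quanta per cm^2 per sec
    area = math.pi * (aperture_cm / 2.0) ** 2   # collecting area of the objective
    return energy, flux, flux * area


if __name__ == "__main__":
    # 10th-magnitude star, 5500 angstroms, telescope of 10 cm aperture.
    energy, flux, rate = quanta_entering(3.5e-10, 5500e-8, 10.0)
    transit_time = 1e-8   # sec; assumed order of magnitude for crossing the tube
    print(f"energy per quantum : {energy:.2e} erg")
    print(f"flux               : {flux:.0f} quanta per cm^2 per sec")
    print(f"rate into telescope: {rate:.0f} quanta per sec")
    print(f"mean interval / transit time: {(1.0 / rate) / transit_time:.0f}")
```

The mean interval between successive quanta comes out thousands of times longer than the assumed transit time, which is the sense in which the quanta traverse the telescope one at a time.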

The interference of single quanta has further been proved by direct 
experiments of G. J. Taylor^^ and of Dempster and Batho^®; the latter have 
obtained, by means of an echelon interferometer, interference fringes even 
when the light was so weak that certainly each quantum entered the apparatus
after the preceding one had left it.

In conclusion, every theory of light which is corpuscular in the
literal sense of the word, that is, which attributes definite trajectories
to light quanta as when dealing with material particles, has turned
out to be absolutely irreconcilable with experimental facts, just as
the wave theory of Maxwell has proved to be in contrast with the
photoelectric effect and with many other experimental facts. We
shall see at a later stage (§34) how a profound revision of the principles
of the wave theory and the corpuscular theory has succeeded in
eliminating this contrast by establishing a unified basis and leaving
to each of the two old theories that portion of validity which is in
accord with experimental facts.

^^Proc. Camb. Phil. Soc. 16, 114 (1909).

^^Phys. Rev. 30, 664 (1927).



CHAPTER 3 


Energy Levels 

14. The spectrum of hydrogen and of hydrogenlike systems. 

Spectroscopy, in addition to furnishing the largest amount of 
information about atoms, is characterized by very high precision in 
yielding wavelength measurements, which can often be performed 
with a relative error below 1 part per million. 

Of all the spectra, the first to have been interpreted was that of 
atomic hydrogen. This spectrum is the simplest, in accordance 
with the fact that the hydrogen atom has the simplest structure. 

If we take a spectroscope and observe the light emitted by a
Plücker tube containing hydrogen, we perceive a large number of
lines that can easily be classified into two categories. There is a
background made up of numerous and closely spaced lines of low
intensity. In contrast are a small number of easily visible lines,
whose intensity increases considerably faster than that of the background
when the current through the tube is raised. It has been
ascertained that the numerous weak lines (which constitute the
so-called many-line spectrum) are emitted by the H2 molecules,
whereas the more intense ones (which form the so-called Balmer
spectrum) are due to the H atoms, which are produced in the tube by
the effect of dissociation caused by the passage of current. The
many-line spectrum, being emitted by the molecules, has a complicated
structure and has only recently been interpreted theoretically;
we shall concern ourselves here only with the Balmer spectrum,
referring to it exclusively when speaking of the hydrogen spectrum.

This spectrum is made up of lines partly in the infrared region, 
partly in the visible region, and partly in the ultraviolet region. It 
is represented schematically in Fig. 6, which assumes a dispersion 
proportional to the frequency. As is evident from the figure, these 
lines are clustered into three groups, called Paschen series (infrared), 
Balmer series (visible) and Lyman series (ultraviolet), respectively. 

The first step toward the theoretical interpretation of spectra
was made in 1885 by the Swiss scientist Balmer, who observed that
all the lines of the series of hydrogen which carries his name (and
which was the only one known then) have frequencies^1 that can be
represented by the formula

$$\nu = R\left(\frac{1}{2^2} - \frac{1}{n^2}\right), \qquad (10)$$

in which R is a constant, now called the Rydberg constant, whose
numerical value is R = 109,678 cm^-1, and n is an integer which can
take on all values from 3 on. As can readily be seen, when the
value of n increases, the values of ν given by formula (10) tend
toward the limit R/4; in fact, the lines of the Balmer series condense
toward a limit, so that beyond a certain point it is no longer possible
to distinguish between them. Nevertheless, more than 30 of
them have been resolved, for all of which the frequency very exactly
satisfies Balmer's formula.

[Fig. 6: Schematic spectrum of atomic hydrogen on a frequency scale, showing the Paschen series (infrared), the Balmer series (visible), and the Lyman series (ultraviolet).]
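The crowding of the Balmer lines toward the limit R/4 is easy to see numerically. The sketch below evaluates formula (10) with the value of R quoted above; the function name is an assumption of the example, and the wavelengths are simply reciprocals of the wave numbers (no correction for the refractive index of air is attempted).

```python
R = 109678.0  # Rydberg constant in cm^-1, as quoted in the text


def balmer_wave_number(n):
    """Wave number in cm^-1 of the Balmer line with integer n >= 3, formula (10)."""
    if n < 3:
        raise ValueError("the Balmer series requires n >= 3")
    return R * (1.0 / 4.0 - 1.0 / n**2)


if __name__ == "__main__":
    series_limit = R / 4.0
    for n in (3, 4, 5, 6, 10, 30, 31):
        nu = balmer_wave_number(n)
        wavelength = 1.0e8 / nu   # in angstroms, since 1 cm = 1e8 angstroms
        gap = series_limit - nu   # distance from the series limit, cm^-1
        print(f"n = {n:2d}  nu = {nu:9.1f} cm^-1  "
              f"lambda = {wavelength:7.1f} A  limit - nu = {gap:8.1f}")
```

Between n = 30 and n = 31 the wave numbers differ by only a few cm^-1, which gives an idea of the resolving power needed to separate the higher members of the series.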

^1 In spectroscopy it is an established practice to call "frequency" of a
radiation not only the true frequency ν (number of vibrations per second,
related to the wavelength by ν = c/λ) but also a quantity proportional to it:

$$\tilde{\nu} = \frac{\nu}{c} = \frac{1}{\lambda}.$$

This is also more properly called "wave number," since it represents the number
of wavelengths in one centimeter. The use of "wave number" is generally
preferred to the use of frequency itself, since λ can be measured directly with
great precision, sometimes better than 1:1,000,000; it is sufficient to take its
reciprocal to get ν̃, whereas ν has to be obtained from λ by the formula ν = c/λ.
Since c is known with a precision little above 1:100,000 (c ≈ 2.99776 × 10^10
cm/sec), ν turns out to be known with somewhat lower precision than ν̃.

To avoid misunderstandings, it will suffice to add to the designation of
"frequency" the unit of measurement, which is sec^-1 for the true frequency ν,
and is cm^-1 for wave numbers ν̃. Generally, the wave number ν̃ is also indicated
by the simple letter ν.

In the following discussion, it will be recognized that Balmer's
formula is merely a particular case of a more general formula representing
all the lines of the spectrum of atomic hydrogen. This
formula is

$$\nu = R\left(\frac{1}{n'^2} - \frac{1}{n^2}\right), \qquad (11)$$
where R is the same constant as above, and n' and n are two integers.
Taking n' = 1 and n = 2, 3, 4, . . . , we obtain the frequencies of
the Lyman series:

$$\nu = R\left(1 - \frac{1}{n^2}\right).$$

Taking n' = 2 and n = 3, 4, 5, . . . , we again obtain (10), which
represents the Balmer series. Finally, taking n' = 3 and n = 4, 5,
6, . . . , we obtain the frequencies of the Paschen series:

$$\nu = R\left(\frac{1}{3^2} - \frac{1}{n^2}\right).$$

If we take n' = 4, we obtain another infrared series, called the
Brackett series, of which only a few lines are known.
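To see how the single formula (11) distributes the four series over the spectrum, one may compute the first line and the limit of each. This is a minimal sketch; the dictionary and function names are assumptions of the example, and R is the value quoted earlier in this section.

```python
R = 109678.0  # Rydberg constant in cm^-1, as quoted in the text

# Fixed integer n' of formula (11) for each named series.
SERIES = {"Lyman": 1, "Balmer": 2, "Paschen": 3, "Brackett": 4}


def wave_number(n_fixed, n):
    """Wave number in cm^-1 from formula (11), with n > n' = n_fixed."""
    if n <= n_fixed:
        raise ValueError("n must exceed n'")
    return R * (1.0 / n_fixed**2 - 1.0 / n**2)


def series_summary(n_fixed):
    """Wavelengths in angstroms of the first line and of the series limit."""
    first_line = 1.0e8 / wave_number(n_fixed, n_fixed + 1)
    limit = 1.0e8 / (R / n_fixed**2)   # n tends to infinity
    return first_line, limit


if __name__ == "__main__":
    for name, n_fixed in SERIES.items():
        first_line, limit = series_summary(n_fixed)
        print(f"{name:9s} n' = {n_fixed}: first line {first_line:8.0f} A, "
              f"limit {limit:8.0f} A")
```

Each series occupies the interval between its first line and its limit, so the Lyman lines fall entirely in the ultraviolet and the Paschen and Brackett lines entirely in the infrared.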

The simplicity of formula (11), the remarkable number of lines 
which it represents, and the extreme precision with which it fits 
experimental results practically rule out the possibility of accidental 
coincidence. Hence, attempts were made to interpret the formula 
with an appropriate atomic model, but for a long time these encountered
apparently insuperable difficulties.

Later on, the discovery was made that spectra strictly analogous
to the hydrogen spectrum are emitted by atoms of the lighter elements,
when they are ionized in such a way that they have lost all
their electrons but one, that is, by the ions He+, Li++,
Be+++, . . . . These ions, being made up of a nucleus and a single electron,
have a structure analogous to the hydrogen atom, from which
they differ only in the mass of the nucleus and in its charge, which,
instead of being e as in the hydrogen atom, is Ze for the element of
atomic number Z. These systems are collectively termed hydrogenlike
systems. It can be foreseen that to this structural analogy
there must correspond an analogy of spectral properties; in fact,
there is reason to suppose that the spectrum of an atom of atomic
number Z, ionized (Z − 1) times, will be represented by a formula
analogous to (11):

$$\nu = R Z^2\left(\frac{1}{n'^2} - \frac{1}{n^2}\right), \qquad (11')$$
and hence will differ from the hydrogen spectrum only because the
frequencies of all lines are multiplied by Z^2 (except for a slight correction
due to the movement of the nucleus, of which we shall
speak in §58 of Part II). For example, the He+ ion (Z = 2) emits
a series called the Pickering series,^2 given by

$$\nu = 4R\left(\frac{1}{4^2} - \frac{1}{n^2}\right) \qquad (n = 5, 6, 7, \ldots), \qquad (12)$$

whose frequencies are four times those of the infrared (Brackett)
series of hydrogen and fall in the region of visible light rather than
in the infrared. Several lines of this series have been observed, and
also some lines of Li++ and Be+++, all satisfying formula (11').
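A short sketch of formula (11') applied to the Pickering series shows both properties at once: the factor of four relative to the Brackett series, and the coincidence of alternate Pickering lines with Balmer lines. The function names are assumptions of the example, and the slight correction due to the movement of the nucleus is neglected, so the coincidences come out exact here.

```python
import math

R = 109678.0  # Rydberg constant in cm^-1, as quoted in the text


def hydrogenlike_wave_number(z, n_fixed, n):
    """Wave number in cm^-1 from formula (11') for atomic number z."""
    return R * z**2 * (1.0 / n_fixed**2 - 1.0 / n**2)


def pickering(n):
    """Pickering series of He+ (Z = 2, n' = 4), formula (12)."""
    return hydrogenlike_wave_number(2, 4, n)


def balmer(n):
    """Balmer series of hydrogen (Z = 1, n' = 2), formula (10)."""
    return hydrogenlike_wave_number(1, 2, n)


def brackett(n):
    """Brackett series of hydrogen (Z = 1, n' = 4)."""
    return hydrogenlike_wave_number(1, 4, n)


if __name__ == "__main__":
    for n in range(5, 13):
        nu = pickering(n)
        note = ""
        if n % 2 == 0 and math.isclose(nu, balmer(n // 2)):
            note = f"  (same as Balmer line n = {n // 2})"
        print(f"n = {n:2d}  nu = {nu:8.1f} cm^-1  lambda = {1.0e8 / nu:7.1f} A{note}")
```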

Finally we shall point out the fact that the lines of hydrogen and 
those of hydrogenlike ions, upon observation with instruments of 
high resolving power, each prove to be composed of a group of 
several very closely spaced lines which constitute the so-called 
fine structure of the lines. In hydrogen the difference in wavelength 
of the various components is, at most, of the order of a few tenths of 
an angstrom unit and hence is detectable only by methods of high 
resolution. In ionized helium, however, it reaches several angstroms,
and therefore the fine structure of these lines can be detected
even with a grating. We shall turn to this phenomenon later on in 
order to give its theoretical explanation. 

Many lines of other spectra further show a so-called hyperfine 
structure, which requires means of very high resolution and has an 
entirely different origin from the preceding fine structure. 

^2 The Pickering series, first observed only in the spectrum of a star, was
attributed to hydrogen, because its lines of even position coincide with the
lines of the Balmer series, as can be seen by observing that (12) can also be
written

$$\nu = R\left(\frac{1}{2^2} - \frac{1}{(n/2)^2}\right).$$

Subsequently, since the Bohr theory predicted that He+ must emit such a
series, these lines were examined again and were found in the spectrum emitted
by a tube containing helium which was absolutely devoid of hydrogen.

15. Spectral series. Spectroscopic terms. Guided by the discoveries
of the hydrogen series, the spectroscopists started to look
for analogous regularities in the spectra of other elements, and in
several other cases they succeeded in finding that the lines are
grouped into series. The lines of the same series are characterized
by a general appearance similar in structure, intensity, and sharpness,
and by an analogous behavior with respect to various physical
influences (such as magnetic fields) that may act upon the source.
For many series they also found empirical formulas analogous to
the Balmer formula.

In all these formulas, just as in Balmer's, the frequency of a
line occurs as the difference between two terms, of which the first
one remains constant, whereas the second assumes different values
(corresponding to the successive integers) for the different lines of a
given series. The terms may quite often be represented by the
formula (given by Rydberg)

$$T_n = \frac{R}{(n - a)^2}, \qquad (13)$$


where R is the Rydberg constant and a another constant. By
holding a fixed and assigning to n all integral values from a certain
value on up, we obtain a sequence of terms. In particular, this
formula contains the Balmer terms (for which a = 0):

$$T_n = \frac{R}{n^2}.$$
A given element usually presents different sequences of terms,
each one being characterized by a given value of a; and the frequencies
of a series of lines are obtained in general by combining a
fixed term of a sequence with all the terms of another sequence.
Thus the formula that yields the frequencies of a series is of the
type

$$\nu = \frac{R}{(n' - a')^2} - \frac{R}{(n - a)^2},$$

where n' is fixed and n assumes all integral values above a certain
value.

However, for many elements the terms T_n have a more complex
form than that of Rydberg. A second type, for instance, is represented
by the so-called Ritz formula

$$T_n = \frac{R}{(n - a + K T_n)^2},$$

where K is another constant characteristic of each sequence.

Finally, in many cases, though it may not be possible to represent
the terms T_n by simple formulas, the frequencies of the various
spectral lines may nevertheless be written in the form

$$\nu = T_{n'} - T_n,$$

thus causing many frequencies of the same substance to depend on
a rather limited number of terms. Hence we can say that the first
step in the interpretation of the spectrum of a substance consists in
searching, within the list of frequencies of its spectral lines, for the
more limited list of "terms" which, when combined, give rise to the
frequencies. As we shall see in the following section, the terms of a
spectrum have an important physical significance not possessed by
the frequencies of the single lines.

Since, by taking the difference of a relatively small number of 
terms, we can arrive at a considerably larger number of frequencies, 
it is evident that large numbers of lines of constant frequency
difference must exist in a spectrum. This fact, which was discovered
empirically by Ritz, carries the name of combination principle.
By suggesting the search for terms, it furnishes the key for the
interpretation of complex spectra.
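The combination principle can be illustrated on the simplest case, the Balmer terms T_n = R/n^2 of hydrogen. The sketch below builds every line from five terms and then exhibits one family of constant frequency differences; the variable names are assumptions of the example.

```python
import math
from itertools import combinations

R = 109678.0  # Rydberg constant in cm^-1, as quoted in the text

# Five Balmer terms T_n = R / n^2 (Rydberg formula (13) with a = 0).
terms = {n: R / n**2 for n in range(1, 6)}

# Every pair of terms (n' < n) yields one line, nu = T_n' - T_n.
lines = {(lo, hi): terms[lo] - terms[hi]
         for lo, hi in combinations(sorted(terms), 2)}

# Lines ending on the same n in the n' = 1 and n' = 2 series always differ
# by the same amount, T_1 - T_2, which is itself a line of the spectrum.
differences = [lines[(1, n)] - lines[(2, n)] for n in range(3, 6)]

if __name__ == "__main__":
    print(f"{len(terms)} terms give {len(lines)} lines")
    for n, diff in zip(range(3, 6), differences):
        print(f"nu(1,{n}) - nu(2,{n}) = {diff:.2f} cm^-1")
    print(f"nu(1,2)             = {lines[(1, 2)]:.2f} cm^-1")
```

In practice the procedure runs in the opposite direction: recurring differences are sought in the measured list of lines, and the terms are inferred from them.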

16. Note on the Bohr theory of the hydrogen atom. The fact
that the frequencies of spectral lines can generally be represented
by the difference of two terms received, in 1913, an interpretation
by Bohr^3 which has since been confirmed by other experimental
facts and has proved to be very fertile; it still constitutes the basis
of spectroscopy. Bohr proposed this theory to interpret the Balmer
series and related ones (to which we shall refer in our explanation
of the theory), but his fundamental concept of energy levels can
be extended to all spectra.

According to the Rutherford model, the hydrogen atom consists
of a nucleus and of an electron which describes a Keplerian ellipse
about it, or, in particular cases, a circle. For simplicity, we shall refer
here to the circular case, as does Bohr; and since the theory applies
to all hydrogenlike systems, we shall immediately consider a hydrogenlike
system of atomic number Z. It is known from mechanics
that the radius r of the orbit is determined by the initial conditions
of the motion and can have any value whatever, so long as the
velocity v is such that the centrifugal force equals the electrostatic
attraction of the nucleus, or mv^2/r = Ze^2/r^2, so that v is related to r
by

$$v = e\sqrt{\frac{Z}{mr}}. \qquad (14)$$

^3 Phil. Mag. 26, 1 (1913).

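The energy of the system (kinetic energy plus potential energy)
can then be expressed as a function of r alone, and we get

$$E = \frac{1}{2}mv^2 - \frac{Ze^2}{r} = \frac{1}{2}\frac{Ze^2}{r} - \frac{Ze^2}{r} = -\frac{1}{2}\frac{Ze^2}{r}. \qquad (14')$$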


(The negative value of E signifies that it is necessary to do work in 
order to break up the atom by removing the electron to an infinite 
distance and bringing it to rest—that is, in order to ionize the atom 
without imparting any velocity to the electron.) 

Now Bohr assumed that in atomic mechanics there exists a
supplementary condition, according to which motion is not possible
on all the circles permitted by ordinary mechanics but only on some
of them, called "stable orbits" or "quantum orbits," whose radii
constitute an infinite but "discrete" (that is, not continuous)
sequence r_1, r_2, . . . . This supplementary condition was formulated
by Bohr as follows. The angular momentum (or moment of
momentum) M = mrv has to be an integral multiple of h/2π:

$$M = n\,\frac{h}{2\pi} \qquad (n = 1, 2, 3, \ldots). \qquad (15)$$

Now, because of (14),

$$M = e\sqrt{mrZ},$$

so that, upon substituting into the preceding expression and solving
for r, we find that the radius of the nth orbit, which we shall indicate
by a_n, is

$$a_n = \frac{n^2 h^2}{4\pi^2 e^2 m Z}. \qquad (16)$$


Figure 7 represents the first three quantum orbits. The innermost
orbit has (in the case of hydrogen, where Z = 1) a radius

$$a_1 = \frac{h^2}{4\pi^2 e^2 m} = 0.527 \times 10^{-8}\ \text{cm}.$$

This value turns out to be of the order of magnitude required for
atomic dimensions by the kinetic theory of gases. The successive
orbits have respective radii 4 times, 9 times, and so on, the
radius of the first orbit. These orbits represent, so to speak,
imaginary tracks upon which the electron is constrained to move,
according to the Bohr theory.
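Formula (16) can be evaluated directly. The following sketch uses the values of e, h, and m listed in the footnote on numerical constants later in this section; the function name is an assumption of the example. The result for n = 1 need not reproduce the quoted 0.527 × 10^-8 cm to the last digit, since it depends on which set of constants is adopted.

```python
import math

E_CHARGE = 4.8025e-10   # electronic charge in e.s.u. (value quoted in this book)
H = 6.624e-27           # Planck constant in erg*sec (value quoted in this book)
M = 9.1066e-28          # electron mass in g (value quoted in this book)


def orbit_radius_cm(n, z=1):
    """Radius a_n of the nth quantum orbit from formula (16), in cm."""
    return n**2 * H**2 / (4.0 * math.pi**2 * E_CHARGE**2 * M * z)


if __name__ == "__main__":
    a1 = orbit_radius_cm(1)
    for n in (1, 2, 3):
        a_n = orbit_radius_cm(n)
        print(f"n = {n}: a_n = {a_n:.3e} cm = {a_n / a1:.0f} x a_1")
    # For a hydrogenlike ion the orbits shrink in proportion to 1/Z.
    print(f"He+ (Z = 2), n = 1: {orbit_radius_cm(1, z=2):.3e} cm")
```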

[Fig. 7: The first three quantum orbits (n = 1, 2, 3).]

To each of these privileged orbits there corresponds an energy

$$E_n = -\frac{1}{2}\,\frac{Ze^2}{a_n}.$$

Thus the energy of the atom can assume only certain discrete values
E_n, called energy levels, each of which corresponds to a different
state of the atom (quantum state), characterized, according to
Bohr, by the motion of the electron on one or another of the quantum
orbits. From the expression (16) for a_n, we find that these
energy levels are given by

$$E_n = -\frac{2\pi^2 e^4 m Z^2}{h^2}\,\frac{1}{n^2}. \qquad (17)$$

Bohr assumed further that the electron can pass, with a sudden
jump (quantum jump), from one orbit (n) to another (n'). Bohr
makes no assumption on the nature of that passage, except that it is
to obey the principle of conservation of energy. Therefore the
atom must absorb or emit the energy corresponding to the difference
between E_n and E_n' (absorb if the transition occurs from an internal
to an external orbit, emit in the opposite case). This energy is
ordinarily absorbed or emitted in the form of radiation. In the
Bohr theory, the mechanism of emission and absorption is thus not 
at all the one of ordinary electromagnetic theory (based on the fact,
governed by Maxwell's equations, that an electron radiates if it
does not move in uniform rectilinear motion). It is even explicitly
being denied that this theory holds in the atomic domain, since 
otherwise emission would take place continuously during the motion 
of an electron on a quantum orbit. Hence the energy of the atom 
would continue to diminish gradually, a condition irreconcilable 
with the existence of discrete energy levels. Instead the emission 
occurs, so to speak, in spurts, every time an electron passes from one 
quantum orbit (n) to a lower-lying one (n'), and no assumption is 
made concerning the mechanism of this emission. The only 
assumption concerns the frequency of the emitted radiation. 
This hypothesis is suggested by the fact, already discovered by 
Einstein (cf. §8), that in various phenomena (the photoelectric 
effect, for example) the radiation of frequency ν is assembled into
quanta, each containing an energy hν. By identifying each of
these quanta with a burst of radiation emitted in a quantum jump,
we are led to assume that the quantity E_n − E_n' of emitted energy
and its frequency ν (in sec^-1) are connected by the relation

$$\nu = \frac{E_n - E_{n'}}{h}, \qquad (18)$$


which is still the fundamental relation of spectroscopy. Analogous 
relations hold for absorption (except for the interchange of En and 
En'). 

From (18) we immediately see why the frequencies emitted by
the atom occur as differences of "spectral terms," and at the same
time we recognize the physical significance of the latter: each of the
terms corresponds to a different energy level of the atom, and hence
to a different quantum state. Thus if we divide (18) by c in order
to go from ν to ν̃ and if we set

$$T_n = -\frac{E_n}{hc}, \qquad (19)$$

we obtain ν̃ = T_n' − T_n. Inserting the energy levels (17), with Z = 1
for hydrogen, we get

$$\tilde{\nu} = \frac{2\pi^2 e^4 m}{c h^3}\left(\frac{1}{n'^2} - \frac{1}{n^2}\right), \qquad (20)$$

which is of the same form as formula (11) for the Balmer and analogous
series. Even more remarkably, when we insert for e, m, and h
their numerical values obtained from other physical phenomena,^4
the coefficient 2π^2 e^4 m/ch^3 of (20) becomes numerically equal
(within the limits of experimental error) to the Rydberg constant
R; in fact, we find that R = 109,710 cm^-1 by this method, whereas
experiment yields R = 109,677.58 cm^-1. The agreement becomes
even better when we take the correction due to the motion of the
nucleus into account (see Part II, §58), which brings the theoretical
value (for hydrogen) to 109,650 cm^-1, with a difference from the
experimental value of about 0.03 percent. The Bohr theory thus
accounts quite well for all the more than 50 lines of atomic hydrogen
and hydrogenlike ions, yielding for the Rydberg constant the
expression

$$R = \frac{2\pi^2 e^4 m}{c h^3}. \qquad (21)$$
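Expression (21) lends itself to a direct numerical check. The sketch below inserts the values of e, h, m, and c quoted in the footnotes of this book. The ratio of proton mass to electron mass, 1836.15, is an assumption brought in from outside the text, and the simple reduced-mass factor stands in for the correction treated in Part II. With rounded constants the result agrees with the figures quoted above only to within the stated probable errors, not digit for digit.

```python
import math

E_CHARGE = 4.8025e-10   # electronic charge in e.s.u. (value quoted in this book)
H = 6.624e-27           # Planck constant in erg*sec (value quoted in this book)
M = 9.1066e-28          # electron mass in g (value quoted in this book)
C = 2.99776e10          # velocity of light in cm/sec (value quoted in this book)

R_EXPERIMENTAL = 109677.58      # cm^-1, as quoted in the text
PROTON_ELECTRON_RATIO = 1836.15  # assumed; not given in the text


def rydberg_fixed_nucleus():
    """Rydberg constant from expression (21), nucleus taken as immovable."""
    return 2.0 * math.pi**2 * E_CHARGE**4 * M / (C * H**3)


def rydberg_hydrogen():
    """Same, with the electron mass replaced by the reduced mass (assumption)."""
    return rydberg_fixed_nucleus() / (1.0 + 1.0 / PROTON_ELECTRON_RATIO)


if __name__ == "__main__":
    for label, value in (("fixed nucleus", rydberg_fixed_nucleus()),
                         ("hydrogen", rydberg_hydrogen())):
        deviation = 100.0 * (value - R_EXPERIMENTAL) / R_EXPERIMENTAL
        print(f"{label:14s}: R = {value:9.1f} cm^-1  ({deviation:+.3f} percent)")
```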

Subsequently the Bohr theory was amplified and perfected,
especially by Sommerfeld. By considering elliptical as well as
circular orbits and by introducing various important refinements,
Sommerfeld was able not only to account for the minutest details of
the hydrogenlike spectra but also to open the path for the study of
more complex atoms. Up until the rise of quantum mechanics in
the late 1920's, this study developed very successfully on the basis
of Sommerfeld's methods. He postulated, in the place of Bohr's
condition (15), more general conditions able to characterize the
quantum orbits even in cases which are more complex than that of
Bohr (Sommerfeld conditions).^5 These conditions, which (just like
Bohr's, which is a particular case thereof) originally appeared to be
rather strange a priori postulates, justifiable only a posteriori by
the (not always complete) correctness of their consequences, can
today be deduced as results of first approximation of the Schrödinger
theory. We shall postpone the presentation of the Bohr-Sommerfeld
theory from that standpoint until Part II, proceeding now to
examine those fundamental ideas of this theory which retain all
their value, even in the new atomic mechanics.

^4 The calculations are made by taking for e, m, and h the best values available
in 1949 among those obtained by methods which do not make use of (21)
[see R. T. Birge, Rev. Mod. Phys. 13, 233 (1941)]: e = 4.8025 × 10^-10 e.s.u.
(from the faraday and Avogadro's number), h = 6.624 × 10^-27 erg-sec (from
the fine-structure constant), m = 9.1066 × 10^-28 g (from e/m), all with
probable errors of the order of 0.03 per cent.

^5 These were proposed independently and almost simultaneously by
W. Wilson [Phil. Mag. 29, 796 (1915)], Ishiwara [Tokyo Math. Phys. Proc. 8, 106
(1916)], and A. Sommerfeld [Ann. d. Physik 61, 1 (1916)].

16a. Determination of the Rydberg constant by means of the
correspondence principle. Although the quantum laws, which are
obeyed by electrons in atoms, differ from the "classical" laws of
ordinary rational mechanics and from Maxwell's electromagnetic
theory, it is clear that the difference must be the less relevant, the
larger the mechanical system with which we are dealing, until it
must become entirely negligible for systems of ordinary size. This
idea has been precisely stated by Bohr in a postulate called correspondence
principle, which has served as a valuable guide to find the
laws of atomic mechanics, and which we shall illustrate in §63 of
Part I. We shall anticipate this principle by a treatment in limited
form but sufficient to apply it to the problem of the hydrogen atom,
discussed in the preceding section. We shall find that the correspondence
principle can replace the postulate of the angular momentum
expressed by (15) and that it enables us to obtain, by a fairly
direct way, the expression (21) for the Rydberg constant.

First of all we recall that, according to classical electromagnetic
theory, if the motion of an electric charge is periodic of frequency ν0,
the magnetic field generated by it will also be periodic, and hence
each of its components can be considered to be a sum of functions,
sinusoidal in time, of frequencies ν0, 2ν0, 3ν0, . . . (development in a
Fourier series; see §9 of Part II). Therefore the radiation will be
composed of spectral lines having these frequencies. The first is
called the fundamental frequency, and its multiples are called
harmonics; of course the latter may be missing entirely or in part.

We shall now calculate the frequencies which would be emitted,
according to classical laws, by the electron of a hydrogen atom by
virtue of its motion in a circle of radius r. The period, or the time
required to complete the orbit, is 2πr/v; hence the fundamental
frequency (which is the reciprocal of the period) will be, taking into
account expression (14) with Z = 1,

$$\nu_0 = \frac{v}{2\pi r} = \frac{e}{2\pi\sqrt{m}}\,\frac{1}{r^{3/2}}.$$

The frequencies which would be emitted according to the classical
theory are thus given by^6

$$\nu_{cl} = s\,\frac{e}{2\pi\sqrt{m}}\,\frac{1}{r^{3/2}} \qquad (s = 1, 2, 3, \ldots). \qquad (\alpha)$$


Let us now take up the problem from the quantum point of view,
and let us therefore introduce the following two postulates:

(a) There exist quantized orbits of radii r = a_n (n = 1, 2, 3, . . .),
which we shall not characterize here. Their energies, by virtue
of (14'), are given by

$$E_n = -\frac{1}{2}\,\frac{e^2}{a_n}. \qquad (\beta)$$

(b) The frequency emitted in the quantum jump from the
nth to the n'th level is given by the Bohr formula (18).

For these frequencies to coincide with the experimental ones
given by (11), we must have

$$E_n = -\frac{Rhc}{n^2},$$

and hence, because of (β),

$$a_n = \frac{e^2 n^2}{2Rhc}. \qquad (\gamma)$$

This is the law which fixes the quantum orbits. The value of the
constant R still has to be determined theoretically.

(c) With an increase of n and n' (and hence of the dimensions of 
the orbits), the frequencies given by the quantum theory must tend 
to become identical with the ones of the classical theory: this is the 
correspondence principle which will permit us to find the theoretical 
value of R. 

Quantum theory yields, for the frequency emitted in the jump
from the nth to the n'th orbit, the following expression (where we
have set n' = n − s):

$$\nu = Rc\left[\frac{1}{(n - s)^2} - \frac{1}{n^2}\right] = \frac{Rc}{n^2}\left[\left(1 - \frac{s}{n}\right)^{-2} - 1\right].$$

^6 In the case of circular motion, which we are here considering, the intensity
of the harmonics turns out to be zero. This fact has no importance in the
reasoning that follows, which can thus also serve as a model for the more general

If we develop the binomial in a power series and indicate by dots
the terms with higher powers of s/n, the formula becomes

$$\nu = \frac{2Rcs}{n^3}\,(1 + \ldots).$$
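On the other hand, the frequencies emitted according to the classical
theory are given by (α), which, upon setting r = a_n and using (γ),
become

$$\nu_{cl} = \sqrt{\frac{2}{m}}\,\frac{(Rhc)^{3/2}}{\pi e^2}\,\frac{s}{n^3}.$$

The ratio of these two expressions is

$$\frac{\nu}{\nu_{cl}} = \pi e^2\sqrt{\frac{2m}{Rch^3}}\,(1 + \ldots).$$

If we let n tend toward infinity (s being held fixed) the equation
tends to the limit (since the terms indicated by dots approach zero):

$$\lim_{n \to \infty}\frac{\nu}{\nu_{cl}} = \pi e^2\sqrt{\frac{2m}{Rch^3}}.$$

Now the correspondence principle requires that this limit be equal
to unity; thus we find that R must be given by

$$R = \frac{2\pi^2 e^4 m}{c h^3}.$$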

This expression for the Rydberg constant coincides with (21), but
we have now found it without making use of condition (15) concerning
the angular momentum; instead we have used the correspondence
principle.
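The limiting process can also be followed numerically. In the sketch below, R is taken from expression (21), the orbit radius a_n from the relation between a_n and R established above, and the classical frequency from the orbital motion on that circle; the ratio of quantum to classical frequency is then printed for increasing n. The constants are those quoted in this book, and the function names are assumptions of the example.

```python
import math

E_CHARGE = 4.8025e-10   # electronic charge in e.s.u. (value quoted in this book)
H = 6.624e-27           # Planck constant in erg*sec (value quoted in this book)
M = 9.1066e-28          # electron mass in g (value quoted in this book)
C = 2.99776e10          # velocity of light in cm/sec (value quoted in this book)

# Rydberg constant in cm^-1 from expression (21).
R = 2.0 * math.pi**2 * E_CHARGE**4 * M / (C * H**3)


def quantum_frequency(n, s=1):
    """Frequency (sec^-1) emitted in the jump from orbit n to orbit n - s."""
    return R * C * (1.0 / (n - s)**2 - 1.0 / n**2)


def classical_frequency(n, s=1):
    """s-th harmonic (sec^-1) of the orbital frequency on the nth orbit."""
    a_n = E_CHARGE**2 * n**2 / (2.0 * R * H * C)   # radius of the nth orbit
    return s * E_CHARGE / (2.0 * math.pi * math.sqrt(M) * a_n**1.5)


def ratio(n, s=1):
    return quantum_frequency(n, s) / classical_frequency(n, s)


if __name__ == "__main__":
    for n in (2, 5, 10, 100, 1000, 10000):
        print(f"n = {n:5d}   nu_quantum / nu_classical = {ratio(n):.5f}")
```

For small n the two frequencies differ widely, while for large n the ratio approaches unity, as the correspondence principle demands once R has the value (21).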

We can then easily deduce condition (15) by observing that from
(14) (with Z = 1) we have M = e√(mr). Substituting for r the
value a_n given by (γ), we obtain

$$M = e^2 n\sqrt{\frac{m}{2Rhc}}.$$

From this formula, we get (15) when substituting the expression
found for R.

17. Quantum states and energy levels. Although the atomic
model proposed by Bohr was profoundly modified later on, the
fundamental postulate of that theory, which is the existence of
discrete energy levels, is today an experimentally established fact,
not only for the hydrogen atom but for all atoms and molecules, as
will be seen in the following sections. We should remember,
therefore, that an atom or a molecule is a mechanical system capable
of being in different states (quantum states) to each of which there
corresponds a different value of the energy. As an intuitive representation
of these states, we may (if we wish) think of them as corresponding
to the motion of the atomic electrons along different "quantum
orbits" (generalization of the privileged circles in the Bohr
theory). But such an interpretation should no longer be regarded
as realistic.^7 Whatever is said in the following sections is independent
of such a model, although it will often be referred to in order to
help intuition and to render the language more expressive.

The different energy levels of an atom (or of a molecule) are
usually represented graphically by means of horizontal lines at
various heights (see Fig. 8, which represents the levels of the
hydrogen atom). Since the energy is always negative, they lie
always below the zero level, which corresponds to the ionization
limit, toward which they normally converge. The lowest level
(n = 1) is called "ground level," because it corresponds to the
condition in which the atom is normally found (ground state); the
other states are called excited states (in the Bohr model, the ground
state corresponds to the smallest orbit).

[Fig. 8: Energy levels of the hydrogen atom on a scale in volts, from the ground level up to the ionization limit, with the Lyman transitions marked.]

^7 We can regard such a model as analogous to the well-known hydrodynamic
models which aid in the comprehension of the fundamental laws of electricity,
by which, for instance, an electric current is likened to the flow of a liquid
through a tube.

Another one of the fundamental ideas of Bohr, which still retains
its full value, is that an atom can pass from one state to another by
absorbing or emitting the energy difference which, when it is in the
form of radiation, has a frequency given by law (18). In this
manner we interpret the existence of “spectral terms” for any atom
or molecule; these correspond to the energy levels by reason of (19),
and therefore they permit the energy level scheme to be constructed
directly from spectroscopic data alone. In the diagram it is
customary to indicate by vertical arrows the transitions that give rise
to the observed spectral lines. Their frequency is thus proportional
to the length of the arrow.
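As a rough numerical illustration of the level scheme and of the frequency rule, the following Python sketch tabulates hydrogen levels and a few transition frequencies. It assumes the Bohr result that the n-th hydrogen level is the ground-level energy divided by n², and it assumes an ionization energy of 13.6057 eV; the function names and constants are illustrative choices, not taken from the text.

```python
# Sketch: Bohr levels of hydrogen and the frequency of a quantum jump.
# Assumptions (not from the text): the numerical constants below and the
# 1/n^2 dependence of the hydrogen levels taken from the Bohr theory.
RYDBERG_EV = 13.6057          # assumed ionization energy of hydrogen, in eV
PLANCK_EV_S = 4.135667696e-15  # Planck constant in eV*s


def level_energy(n: int) -> float:
    """Energy of the n-th level in eV (negative; zero is the ionization limit)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -RYDBERG_EV / n**2


def transition_frequency(n_upper: int, n_lower: int) -> float:
    """Frequency in Hz of the radiation emitted in the jump n_upper -> n_lower."""
    return (level_energy(n_upper) - level_energy(n_lower)) / PLANCK_EV_S


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"n = {n}: E = {level_energy(n):9.4f} eV")
    for n in range(2, 5):
        print(f"{n} -> 1: nu = {transition_frequency(n, 1):.4e} Hz")
```

The levels crowd together toward zero as n grows, and the longer the "arrow" between two levels, the higher the computed frequency.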

We can say that the principal purpose of atomic mechanics is to 
lead to a quantitative prediction of the energy levels of the atoms, 
which today is done by criteria different from those of Bohr. We 
shall postpone these considerations until later on, when we shall 
point out those other experimental facts which confirm the existence 
of discrete quantum states and the exactness of law (18) in regard to 
transitions between them, with the emission or absorption of 
radiation. 

18. Excitation and ionization by electronic impact. An atom
may acquire energy not only by absorption of radiation but also by
collision with another atom or an electron. In terms of the Bohr
model, an atomic electron, upon being hit by another particle, may
pass from the ground orbit to a more external one (excitation),
removing the necessary energy from the kinetic energy of the
impinging particle.⁸ If the impact is sufficiently violent, the electron
may be entirely removed from the atom, thus giving rise to the
ionization of the latter: ionization is thus a limiting case of excitation.

The experiments described in the succeeding sections are
concerned precisely with the excitation of atoms by electron impact;
therefore we shall anticipate a few general considerations about that
phenomenon.

⁸ Excitation by collision plays a fundamental part in the ordinary
temperature emission of radiation. In fact, the latter radiation occurs in the
following manner: the collisions due to thermal agitation excite a few of the
atoms or molecules (at the expense of kinetic energy or of thermal energy);
these atoms (or molecules), when returning to the ground state, emit the
energy received in the form of radiation.

First of all we observe that, in view of the very large mass an
atom has compared with an electron, the kinetic energy an atom can
receive in a collision with an electron is entirely negligible, and we
may proceed on the assumption that the atom represents a fixed
obstacle for the electron. The energy that is communicated to the
atom by the collision serves entirely to increase its internal energy,
that is, to induce a transition from the ground state to an excited
state. This being understood, the collision may have various
effects according to the energy of the colliding electron.

First of all, it is clear that the collision can produce no effect
whatsoever if the energy of the colliding electron is below that
required to raise the atom into the first of the excited states, that is
to say, if it is less than $E_2 - E_1$, which limit is called resonance
energy for reasons that will become apparent in what follows. In
such a case, therefore, the atom remains unaltered, and the electron
goes away as if it had collided with a perfectly elastic body; for this
reason, such collisions are called elastic collisions.

If, on the other hand, the kinetic energy of the colliding electron
is even slightly higher than the resonance energy, it may happen
that the colliding atom becomes excited, and hence that the electron
departs with a lower kinetic energy exactly equal to the excess of
the initial kinetic energy above the resonance energy $E_2 - E_1$. It
is as if the electron had collided with a soft body: such collisions are
called inelastic. If the initial kinetic energy is even greater than
$E_3 - E_1$, then there may occur, in addition, inelastic impacts that
transport the atom into the energy state $E_3$ and hence remove from
the electron an amount of energy $E_3 - E_1$ (second excitation energy),⁹
and so forth. Finally, if the colliding electron has an energy above
$-E_1$ (ionization energy), inelastic collisions will occur, resulting in
the ionization of the atom, and hence in the appearance of a positive
ion, in addition to two free electrons.
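The case distinction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the level energies are supplied by the caller as an ascending list of negative numbers (the zero being the ionization limit), the function name `collision_outcomes` is hypothetical, and the hydrogen-like values in the usage example are chosen purely for demonstration.

```python
from typing import List, Tuple


def collision_outcomes(kinetic: float, levels: List[float]) -> List[Tuple[str, float]]:
    """List the collisions an electron of the given kinetic energy can undergo.

    `levels` holds E1, E2, ... in ascending order (negative values, same
    units as `kinetic`). Each outcome is paired with the kinetic energy left
    over after the collision; for ionization that remainder is shared by the
    two free electrons.
    """
    ground = levels[0]
    # An elastic collision is always possible and costs the electron nothing.
    outcomes = [("elastic", kinetic)]
    for index, energy in enumerate(levels[1:], start=2):
        excitation = energy - ground          # E_k - E_1
        if kinetic > excitation:
            outcomes.append((f"excitation to E{index}", kinetic - excitation))
    ionization = -ground                      # 0 - E_1
    if kinetic > ionization:
        outcomes.append(("ionization", kinetic - ionization))
    return outcomes


if __name__ == "__main__":
    demo_levels = [-13.6, -3.4, -1.51]        # illustrative values in volts
    for energy in (5.0, 11.0, 12.5, 14.0):
        print(energy, collision_outcomes(energy, demo_levels))
```

Below the resonance energy only the elastic outcome is listed; each further threshold adds one more possible inelastic process.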

⁹ Each atom evidently has a series of excitation energies, the first of which
is called “resonance energy”; they converge to an upper limit, which is the
ionization energy.

When an atom becomes excited, it will generally not remain in
that state but after a very short time (which from experiments of
Wien turns out to be of the order of $10^{-8}$ sec) will return to the
ground state, emitting the residual energy in the form of radiation,
with a frequency given by Bohr's formula (18). Thus we can see
that when atoms are bombarded by electrons of energy sufficient to
excite them, they must emit radiation. More exactly, if the kinetic
energy of the electrons lies above $E_2 - E_1$ (but not above $E_3 - E_1$),
the atoms can be excited only to the level $E_2$ and hence can only
emit radiation of frequency $(E_2 - E_1)/h$; the emitted radiation,
when analyzed with a spectroscope, will thus yield a single line. If
instead the kinetic energy of the electrons exceeds $E_3 - E_1$, some
atoms will be excited to state $E_2$, others to state $E_3$, and hence
the two lines of frequency $(E_2 - E_1)/h$ and $(E_3 - E_1)/h$ can be
emitted; sometimes it may even happen that an atom excited to
state $E_3$ will return to the ground state in two steps, that is, first
by passing to the state $E_2$ and from there to state $E_1$. In this event,
we shall see the emission of a line with frequency $(E_3 - E_2)/h$
during the first of these transitions. Thus, if we continue to increase
the energy of the electrons, we observe the emission of a more and
more complete spectrum. We therefore have, by spectroscopic
observation, another method of checking the phenomena of
excitation by collision.
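The gradual completion of the spectrum can be made concrete with a short Python sketch. It assumes, for simplicity, that every downward jump between reachable levels is permitted (selection rules are ignored); the function name `observable_lines` and the sample level values are hypothetical.

```python
from typing import List, Tuple


def observable_lines(kinetic: float, levels: List[float]) -> List[Tuple[int, int, float]]:
    """Return the lines that can appear for a given electron energy.

    `levels` holds E1, E2, ... in ascending order. Each line is reported as
    (upper index, lower index, energy difference); dividing the difference
    by h gives the frequency. Assumption: all downward jumps are allowed.
    """
    ground = levels[0]
    top = 0  # zero-based position of the highest level the electrons can excite
    for position, energy in enumerate(levels):
        if position > 0 and kinetic > energy - ground:
            top = position
    lines = []
    for upper in range(1, top + 1):
        for lower in range(upper):
            lines.append((upper + 1, lower + 1, levels[upper] - levels[lower]))
    return lines


if __name__ == "__main__":
    demo_levels = [-13.6, -3.4, -1.51]   # illustrative values in volts
    for energy in (5.0, 11.0, 12.5):
        print(energy, observable_lines(energy, demo_levels))
```

With these sample levels no line appears at 5 volts, a single line at 11 volts, and three lines (including the two-step return through $E_2$) at 12.5 volts.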

In the succeeding sections we shall show briefly how these
phenomena have been verified experimentally in a great number of
investigations, among which are the classic ones performed by
Franck and Hertz in 1913-1914.¹⁰ The fundamental principle of
these experiments goes back to an even earlier period, since, starting
in 1902, Lenard had applied it in a remarkable series of researches
on the ionization by collision.

¹⁰ Verh. d. D. Phys. Ges. 16, 12 (1914).

Before describing these experiments, we should like to make a general
observation in order to explain a few concepts that have become customary
in this field of study. The arrangement generally used to accelerate
electrons to a given velocity consists (see Fig. 9) of a metallic filament,
made red-hot by passing a current, thus emitting electrons of low velocity
because of the thermionic effect. In front of the filament, a thin metallic
net, called a grid, is maintained (by means of a battery and a potentiometer)
at a positive potential with respect to the filament, so that the electrons
emitted by the latter find themselves in an electric field which accelerates
them toward the grid. They then pass through its meshes and continue
beyond because of their inertia. Thus, if V is the potential difference
between the filament and the grid, and if we assume that the velocity with
which the electrons are expelled from the filament is negligible, we see,
by applying the theorem of kinetic energy, that they reach the grid with
a kinetic energy equal to eV, and hence with a velocity

$v = \sqrt{\dfrac{2eV}{m}}$.    (22)

Thus, by varying V by means of the potentiometer, we can regulate the
velocity v at will.

This procedure has given rise to the practice of characterizing the
kinetic energy and the velocity of the electrons by indicating directly the
potential V, expressed in volts, necessary to produce that energy, and to
call this potential the “energy expressed in volts” or, even more
improperly, the “velocity expressed in volts.”¹¹ For instance, an electron
is said to have a velocity of 1 volt, if it has that velocity which it would
acquire by falling through a potential difference of 1 volt, that is, a velocity
of $5.932 \times 10^{7}$ cm/sec. We observe that the velocity is really not
proportional to the “velocity expressed in volts” but to its square root [see
equation (22)].
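The square-root dependence can be checked numerically with the following Python sketch. It assumes the non-relativistic kinetic-energy relation (adequate for potentials of a few volts) and uses modern SI values of the electron charge and mass, which are not taken from the text; the function name is illustrative.

```python
import math

# Assumed modern SI constants (not from the text).
ELECTRON_CHARGE_C = 1.602176634e-19   # coulomb
ELECTRON_MASS_KG = 9.1093837015e-31   # kilogram


def electron_speed_cm_per_s(volts: float) -> float:
    """Speed, in cm/sec, of an electron that has fallen through `volts`.

    Uses (1/2) m v^2 = e V, i.e. the non-relativistic approximation.
    """
    if volts < 0:
        raise ValueError("the accelerating potential must be non-negative")
    speed_m_per_s = math.sqrt(2.0 * ELECTRON_CHARGE_C * volts / ELECTRON_MASS_KG)
    return 100.0 * speed_m_per_s


if __name__ == "__main__":
    for volts in (1.0, 4.0, 9.0):
        print(f"{volts:4.1f} volt -> {electron_speed_cm_per_s(volts):.4e} cm/sec")
```

Quadrupling the potential only doubles the speed, which is why the expression "velocity expressed in volts" is called improper.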

For the same reason, it has become an established practice to designate
the energy of excitation or of ionization of atoms by indicating the potential
difference, in volts, necessary to impart that energy to electrons. One
thus speaks of “resonance potential,” “excitation potential,” and
“ionization potential”; and energy levels are also often expressed in volts.
Given one of these potentials, V, we obtain the corresponding energy E by
the obvious relation

$E = eV$,    (22')

where e is the charge of the electron in absolute value. From this equation
we find that an “energy of 1 volt” is equivalent to $1.602 \times 10^{-12}$ erg.

Usually the resonance, excitation, and ionization potentials of the
atoms are of the order of a few volts, rarely above 20 volts.

Finally we note that to each excitation or ionization potential there
corresponds a well-determined frequency (according to Einstein's relation
between energy and frequency), given by

$\nu = \dfrac{E}{h} = \dfrac{eV}{h}$,    (23)

which represents the frequency of a light quantum having the same energy
as an electron which has fallen through a potential difference V. If then
$\nu$ is expressed in cm$^{-1}$ (which will be indicated by $\bar{\nu}$ as usual),
equation (23) becomes

$\bar{\nu} = \dfrac{eV}{hc}$.    (23')

Frequently the energy levels of atoms are simply expressed in cm$^{-1}$
according to (23'); 1 volt corresponds to 8067.9 cm$^{-1}$.

¹¹ Often we say “in electron-volts.”

If, on the other hand, the radiation is characterized by the wavelength
$\lambda$, the relation (23') must be replaced by

$\lambda = \dfrac{hc}{eV}$.    (24)

From (24), upon substituting the numerical values, we obtain the following
easily remembered relation between $\lambda$ and V, where $\lambda$ is expressed in
angstrom units and V is expressed in volts:

$\lambda V = 12{,}395$.    (24')
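The conversions between volts, frequency, wave number, and wavelength can be sketched in Python as follows. The constants are modern SI values, which are an assumption of this sketch and not the values used in the text; with them 1 volt corresponds to about 8065.5 cm$^{-1}$ and the product $\lambda V$ comes out near 12,398, slightly different from the figures quoted above. The function names are illustrative.

```python
# Assumed modern SI constants (not from the text).
PLANCK_J_S = 6.62607015e-34          # Planck constant, joule * second
LIGHT_SPEED_M_S = 2.99792458e8       # speed of light, metre / second
ELECTRON_CHARGE_C = 1.602176634e-19  # electron charge, coulomb


def volts_to_frequency_hz(volts: float) -> float:
    """Frequency nu = eV/h of a quantum with energy eV."""
    return ELECTRON_CHARGE_C * volts / PLANCK_J_S


def volts_to_wavenumber_per_cm(volts: float) -> float:
    """Wave number eV/(hc), converted from m^-1 to cm^-1."""
    return ELECTRON_CHARGE_C * volts / (PLANCK_J_S * LIGHT_SPEED_M_S) / 100.0


def volts_to_wavelength_angstrom(volts: float) -> float:
    """Wavelength hc/(eV), converted from metres to angstrom units."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / (ELECTRON_CHARGE_C * volts) * 1.0e10


def wavelength_angstrom_to_volts(wavelength: float) -> float:
    """Potential whose quantum has the given wavelength (angstrom units)."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / (ELECTRON_CHARGE_C * wavelength * 1.0e-10)


if __name__ == "__main__":
    print(f"1 volt -> {volts_to_frequency_hz(1.0):.4e} Hz")
    print(f"1 volt -> {volts_to_wavenumber_per_cm(1.0):.1f} cm^-1")
    print(f"1 volt -> {volts_to_wavelength_angstrom(1.0):.1f} angstrom")
    print(f"5890 angstrom -> {wavelength_angstrom_to_volts(5890.0):.3f} volt")
```

The last line of the usage example converts a wavelength of 5890 angstrom units back into a potential of roughly 2.1 volts.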

19. Experiments of Franck and Hertz. The experiments of
Franck and Hertz consist in the excitation of atoms by electronic
impact and in the indirect measurement of the kinetic energy lost
by the electrons in the collisions, or else of the energy imparted to
the atom when it is being excited. In this fashion we obtain the
heights of the various energy levels above the ground level, or else
the “excitation energies” $E_2 - E_1$, $E_3 - E_1$, and so on.
Comparison of the values found with the values obtained from spectral
terms constitutes a verification of the fundamental Bohr hypothesis
expressed by (18).

[Fig. 9: schematic of the Franck and Hertz arrangement.]

The arrangement of Franck and Hertz is schematically the
following (Fig. 9). In a glass tube containing the substance
under study in a rather rarefied gas or vapor state, the system for
the acceleration of the electrons is located. As has been stated in
the preceding section, this system is composed of an incandescent
filament F and of a grid G, maintained at a positive potential with
respect to the filament F by a battery. A potentiometer permits
this potential difference to be regulated, and a voltmeter (not
shown in the figure) indicates it exactly. The field existing between
F and G accelerates the electrons, which rush toward the metallic
net and partly traverse it. On the other side of the grid, at a
distance of about 1 mm, there is a plate P, maintained at a potential
slightly below that of the grid, so that the electrons are slightly
slowed down in that space. Having reached the plate P, they
return to the filament through a galvanometer A.

If the electrons suffer no inelastic collisions, they will move with
accelerated motion up to the grid, where they will attain their
maximum velocity. This maximum corresponds to a kinetic
energy equal to eV (if V is the potential difference between filament
and grid). This kinetic energy is more than enough to make the
electrons overcome the weak opposing field existing between grid
and plate, so that those which pass between the meshes of the grid
all arrive at the plate and are registered by the galvanometer.

But now let us suppose that we increase the potential V until
eV is slightly larger than $E_2 - E_1$; the kinetic energy of the
electrons in the neighborhood of the grid will then be sufficient to
produce inelastic impacts; some of the electrons will lose almost all
their kinetic energy in a collision and will no longer be able to
overcome the opposing field and to reach the plate. Hence the
galvanometer, as soon as eV exceeds the limit $E_2 - E_1$, will show a sudden
decrease in current. Reading the potential, $V'$, for which this
discontinuity occurs, we can calculate the first excitation energy (or
resonance energy) by the formula

$E_2 - E_1 = eV'$.

This $V'$ is thus the potential we have called resonance potential.

If the potential is increased further, the current also increases,
until a new sudden diminution occurs for a value $V''$ nearly twice
that of $V'$. Indeed, when V is raised, it follows that the inelastic
collisions, rather than being produced only in the vicinity of the grid,
are already produced ahead of it (that is, closer to the filament).
The more V is raised, the more the region recedes in which such
collisions occur, so that the electrons which have lost all their
velocity in a collision can still be accelerated by the field before
arriving at the grid; when eV reaches $2(E_2 - E_1)$, some of these
electrons, after the first inelastic impact, reacquire in the field an
energy $E_2 - E_1$ and hence undergo a second inelastic collision near
the grid, in which they once again lose their kinetic energy and thus
remain incapable of reaching the plate. Similarly it is evident that
there will be another dip in the current for $eV = 3(E_2 - E_1)$, and so
on. In fact, the experimental curves exhibit the behavior
represented by Fig. 10, which applies to mercury vapor (Einsporn).

[Fig. 10: experimental current curve for mercury vapor (Einsporn).]

We may take advantage of this fact by measuring with great
accuracy the resonance potential $V'$, obtaining it from the distance
between two consecutive maxima of the curve, rather than from the
position of the first maximum; in this manner we eliminate those
causes of error (contact potentials, for example) which produce a
displacement of the curve as a whole or which perturb its first part.
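The advantage of using the spacing of the maxima can be illustrated with a small Python sketch. The data are synthetic and purely hypothetical: maxima are placed at multiples of an assumed resonance potential of 4.9 volts and then shifted as a whole by an assumed contact potential of 1.3 volts. The function name is illustrative.

```python
from statistics import mean
from typing import Sequence


def resonance_potential_from_maxima(maxima_volts: Sequence[float]) -> float:
    """Estimate the resonance potential as the mean spacing of the maxima.

    A shift common to all maxima (for example a contact potential) cancels
    out in the differences between consecutive positions.
    """
    if len(maxima_volts) < 2:
        raise ValueError("at least two maxima are needed")
    ordered = sorted(maxima_volts)
    spacings = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return mean(spacings)


if __name__ == "__main__":
    true_potential = 4.9     # hypothetical resonance potential, volts
    contact_shift = 1.3      # hypothetical contact potential, volts
    maxima = [contact_shift + k * true_potential for k in range(1, 5)]
    print("position of first maximum:", maxima[0])
    print("estimate from spacings:   ", resonance_potential_from_maxima(maxima))
```

In this synthetic case the first maximum alone would suggest 6.2 volts, while the spacing recovers the assumed 4.9 volts regardless of the shift.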

Thus far we have reasoned as if the atom could absorb only the
energy $E_2 - E_1$, thus omitting the possible transitions from the
ground level $E_1$ to the levels $E_3$, $E_4$, and so on. This omission is
permissible in the many cases where collisions are so frequent that
almost all electrons, as soon as they have reached an energy
sufficient to carry an atom to the level $E_2$, lose this energy in a collision
and thus are no longer able to produce an excitation requiring a
higher energy.

However, by varying the conditions of the experiment, this effect
can be prevented (it is necessary to decrease the pressure of the gas);
then different series of maxima occur in the curve, each series
corresponding to an excitation level.

Finally, we point out that the same apparatus permits the direct
determination of the ionization potential: all we need do is to
establish a considerable and fixed potential difference between grid
and plate, larger than any potential used to accelerate the electrons
(for example, 40 volts), instead of a potential difference small
compared with the potential to be determined. None of the electrons
projected from the filament will then be able to reach the plate P,
since they will all be driven back by the field existing between G and
P. But if some of the colliding atoms become ionized in the vicinity
of the grid, they will remain positively charged and thus will rush
toward the plate and be registered by the galvanometer. It will
then suffice to increase the accelerating potential of the electrons
gradually, until the galvanometer begins to read a current; this
potential $V_i$ will be the one necessary to ionize the gas, which
amounts to saying that

$eV_i = -E_1$.

This cursory description of these experiments will be sufficient
to give an understanding of their fundamental concept and to show
the possibility of determining the various energy levels by electrical
means. We cannot describe here all the particular techniques
and the many variations of these experiments which were
performed on various substances, not only by Franck and Hertz but
by many other physicists; for these data the reader is referred to
No. 28 of the Bibliography.

20. Optical verification of the excitation by collision. In the
experiments described in the preceding section, the energy absorbed
by the colliding atom is determined indirectly. A more direct
method of measuring it consists, as has been mentioned, in observing
the radiation emitted by the atom when it is returning to its ground
state.

The apparatus consists essentially of a tube of quartz glass
containing the usual incandescent filament and grid, between which
there is maintained a known accelerating voltage. Then an image
of the region where the collisions occur is projected upon the slit of
a spectroscope, and the investigator determines those values of the
voltage for which the various lines of the spectrum appear. This
method was used with mercury vapor by Franck and Hertz in 1914.
They found that the line $\lambda = 2536.6$ Å (resonance line) of that gas
did not occur so long as the potential difference was below 4 volts
but that it could already be photographed at 5 volts. More precise
measurements have yielded 4.9 volts for the resonance potential of
mercury. From the wavelength of the resonance line, we find, by
means of formula (24'), that the potential necessary to excite this
line should be 4.87 volts, in perfect agreement with the value found
experimentally.

Analogous experiments have been repeated by many other
observers and on many different substances, always with results in
full accord with the theory (see No. 28 of the Bibliography).

It should be noted that very often the radiations obtained by
electronic impact lie in the ultraviolet region. In that case it is
possible to detect them not only photographically but also by
making use of their photoelectric effect. If we gradually increase the
accelerating voltage of the electrons, following simultaneously the
variations of the photoelectric current, and if we represent this
dependence graphically, we observe that the curve suffers sudden
deflections corresponding to the excitation potentials, since to each
of these potentials there corresponds the emission of new lines and
hence an increase in the photoelectric effect. This was the method
used by Horton and Davies, Franck and Knipping, Foote and
Mohler, and others.

All these experiments, while establishing a strict connection
between optical phenomena and the phenomena of electronic
collisions, constitute a direct proof of the physical reality of energy
levels, as well as of the validity of the Bohr formula relating the
emitted frequency to the energy difference of two quantum states.

21. Collisions of the second kind. As we have seen, in a collision
between an electron and an atom it may happen that the electron
gives up part of its kinetic energy to the atom in the form of
excitation energy. Klein and Rosseland have observed that in addition
to this type of collision, said to be of the first kind, there must exist
the possibility of collisions representing the inverse phenomenon;
that is to say, an already excited atom colliding with an electron
gets rid of its excitation, not by radiating but by giving its excitation
energy to the electron in the form of kinetic energy. Hence the
electron recoils with a velocity higher than it had before. The
necessity for the existence of these collisions, called collisions of the
second kind, has been demonstrated by Klein and Rosseland by
means of the following thermodynamic argument.

Let us suppose that a gas formed of atoms and electrons is
enclosed in a cavity with walls that are perfectly reflecting for atoms
as well as for electrons and radiation, so that no energy exchanges
with the outside are possible. As a consequence of the collisions
of the first kind, the electrons will gradually give up their kinetic
energy to the atoms. If there were no collisions of the second kind,
the average kinetic energy of the electrons would continually
diminish with respect to the energy of the atoms. In this way two mixed
gases would result, which, rather than tending toward a temperature
equilibrium, would get further and further away from it; this
condition would contradict the second law of thermodynamics.

The existence of collisions of the second kind is thus a
thermodynamic consequence of the experimentally observed collisions of
the first kind.

The collisions of the second kind also have macroscopic effects
of great practical importance. In fact, it is by virtue of these
collisions that a gas traversed by radiation heats up, since the
radiation absorbed by the molecules excites them and then is
converted into molecular kinetic energy (that is, into heat) by means of
the collisions of the second kind.

22. Optical resonance. The scheme of the energy levels lends
itself to the interpretation of the phenomena of optical resonance
and of fluorescence, as well as those of ordinary emission and
absorption. However, as we shall see, the phenomenon of so-called optical
resonance can be interpreted equally well by the classical scheme.

This phenomenon, discovered by Wood, consists of the following:
if a metallic vapor is irradiated with light of a wavelength exactly
corresponding to one of its definite absorption lines, that light is not
genuinely “absorbed” by the gas (that is, transformed into heat)
but is scattered almost entirely in all directions. Characteristic of
this phenomenon are the surprisingly large intensity scattered even
by a very rarefied vapor, and the almost absolute monochromaticity
of the scattered light.

The experiment, performed for the first time with mercury vapor,
is done in the following manner. Into a large, well-evacuated
quartz bulb is placed a drop of mercury, so that the bulb contains
mercury vapor saturated at ordinary temperatures, and thus of very
low density. If this bulb is exposed to the rays of a quartz-walled
mercury lamp, which strongly emits the ultraviolet line of
wavelength $\lambda = 2536.6$ Å, it will be found that the bulb intensely re-emits
that light, which can be photographically detected. Spectrographic
analysis of the light of the lamp after it has traversed the bulb
shows that the line $\lambda = 2536.6$ Å has been strongly absorbed. The
phenomenon does not take place if the incident light has a
wavelength ever so slightly different from this value, and accordingly
such a line is called a resonance line.

The phenomenon can also be observed with a bulb filled with
sodium vapor and illuminated by the yellow D line of sodium (or
else by one of its components, $\lambda = 5890$ Å and 5896 Å). In this
case there is the advantage that the light is visible without
photography, and the bulb appears brightly radiant to the eye; however,
it is necessary to heat the bulb, because the density of saturated
sodium vapor at ordinary temperatures is too small.

The classical interpretation of resonance, from which the name
of the phenomenon is derived, is as follows. The atoms of the
vapor contain oscillators having frequencies equal to those contained
in the emitted light. If we illuminate the atoms with light of one
of these frequencies, the corresponding oscillators, finding
themselves excited by an alternating electric field with a frequency equal
to their own, will go into vibration of rather large amplitude,
because of the phenomenon of “mechanical resonance,” well known
also in acoustics; hence the oscillators will emit intense light in all
directions.

In contrast, the quantum theory interprets the phenomenon of
resonance as follows. The atoms finding themselves in the ground
state can pass to a higher level, not only under the action of an
electron collision but also under the action of incident radiation,
that is, of photons; provided, however, that each of the photons
contains exactly the energy necessary for the excitation of the atom.
Thus, if the atom is struck by light of frequency

$\nu = \dfrac{E_2 - E_1}{h}$,

it can go from state 1 to state 2. However, it normally does not
remain in the second state, as we have already pointed out, but
returns to the ground state by performing the inverse jump, and
hence emits the absorbed energy in the form of radiation of precisely
the same frequency.

This theory makes possible the explanation of many peculiarities
of the resonance phenomenon which remain unexplained in the
classical theory.

23. Fluorescence. A phenomenon related to resonance but
considerably more common and well known for a long time, is that of
fluorescence. In fluorescence certain substances, when illuminated,
re-emit light in all directions, of spectral composition different from
that of the incident light. For instance, a solution of fluorspar, of
extract from the bark of the horse chestnut (aesculin), or of sulfate
of quinine, when placed in a glass container and illuminated with
white light, appears (when viewed laterally) radiant with azure or
greenish light. A chlorophyll solution, on the other hand, presents
a red fluorescence.

The phenomenon of fluorescence generally obeys the following
law, discovered empirically by Stokes, whose name it carries:
fluorescent light does not contain frequencies above that of the exciting
light (which we assume to be monochromatic).

This phenomenon may be explained by the quantum theory in
the following manner. As we have said, the atoms become excited
by the incident light; if they are carried from the ground level $E_1$
to the one immediately above, we are dealing with resonance, of
which we have already spoken. If instead they are carried into
a higher energy level, such as $E_3$, they may then, in order to return
to the ground state, perform two (or more) successive quantum
jumps, in each of which they emit light of a different frequency.
For instance, from the level $E_3$ they would go to the level $E_2$ with
the emission of the line of frequency $(E_3 - E_2)/h$, thence passing
from $E_2$ to $E_1$ with emission of frequency $(E_2 - E_1)/h$. Similarly,
if the excitation had carried the atom to an even higher level, it
could descend by different successive jumps. Hence it can be
understood how a single exciting frequency may give rise to the
emission of different frequencies. In this way Stokes's law is
immediately justified, since it is evident that all the quantum jumps
performed during the emission phase correspond to level differences
smaller than that corresponding to the excitation, and hence to
frequencies lower than that of the exciting light.
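The argument for Stokes's law can be checked by brute force with a short Python sketch that enumerates every way of descending from an excited level to the ground level. It assumes, for illustration only, that all downward jumps are permitted; the function names and the sample level values are hypothetical.

```python
from typing import List


def cascades(start: int, ground: int = 1) -> List[List[int]]:
    """All strictly descending sequences of level numbers from start to ground."""
    if start == ground:
        return [[ground]]
    paths = []
    for next_level in range(ground, start):
        for tail in cascades(next_level, ground):
            paths.append([start] + tail)
    return paths


def emitted_energies(path: List[int], levels: List[float]) -> List[float]:
    """Energy emitted in each jump of a path; levels[0] is E1, levels[1] is E2, ..."""
    return [levels[a - 1] - levels[b - 1] for a, b in zip(path, path[1:])]


def obeys_stokes(start: int, levels: List[float]) -> bool:
    """True if no jump of any cascade exceeds the excitation energy E_start - E1."""
    excitation = levels[start - 1] - levels[0]
    return all(
        energy <= excitation + 1e-12
        for path in cascades(start)
        for energy in emitted_energies(path, levels)
    )


if __name__ == "__main__":
    demo_levels = [-13.6, -3.4, -1.51, -0.85]   # illustrative values in volts
    for path in cascades(4):
        print(path, emitted_energies(path, demo_levels))
    print("Stokes's law satisfied:", obeys_stokes(4, demo_levels))
```

Since each cascade only moves downward, the energies emitted along it add up to the excitation energy, so no single jump can exceed it; equality occurs only for the direct return to the ground level.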

24. Sensitized fluorescence. A significant confirmation of the
quantum theory of fluorescence is furnished by the following
experiment, performed for the first time by Cario¹² and further studied by
Franck and Cario and by others. A quartz tube, at rather elevated
temperatures, contains a mixture of mercury vapor and the vapor
of another metal, thallium for example. The tube is illuminated
with light of wavelength 2536.6 Å emitted by a mercury arc, and
the experimenter observes the laterally scattered light with a
spectroscope. He finds, in addition to the resonance line of the
mercury, a certain number of lines characteristic of thallium.
However, these do not appear if he illuminates some thallium vapor
unmixed with mercury vapor, or if the incident light has a frequency
different from the resonance line of mercury.

¹² Zeits. f. Physik 10, 185 (1922).

This phenomenon, called sensitized fluorescence or indirect
fluorescence, forces us to think that the state of excitation induced by
the light in one of the gases may be transmitted to the other gas,
and is interpreted as follows. Many mercury atoms absorb the
incident radiation and go to an excited state. If they happen to
collide with a thallium atom while they are still in that state, they
may get rid of their excitation energy by transferring it to a thallium
atom and thus carrying it into an excited state. The thallium
atom in turn, returning in one or more jumps to the normal state,
emits its characteristic lines.

Of course, the energy furnished by the excited mercury atom is
not equal to the energy necessary to excite the thallium atom. It
is normally superior to the latter, and the surplus energy is imparted
to the thallium atom in the form of kinetic energy.

25. Note on the development of spectroscopy. The spinning
electron. In general it is not possible to calculate quantitatively
the energy levels of an atom with several electrons by the
Bohr-Sommerfeld theory, as has been possible in the case of hydrogen.
In fact, already in the case of only two electrons (helium atom) the
problem presents serious mathematical difficulties. It has been
treated with approximations, and we shall speak of the results of
these laborious calculations shortly. In the more complex cases
the mathematical difficulties become practically insurmountable.
Nevertheless it has been possible to attack the problem, at least
qualitatively, thanks to a simplifying hypothesis proposed by Bohr,
which has proved to be in good agreement with reality. This
hypothesis consists in the assumption that the quantum jumps
which give rise to radiation generally involve only one electron: the
most easily excited electron or else the electron whose normal
energy level lies highest. This electron is called the emission
electron; it is assumed that the others continue to describe their orbits
around the nucleus undisturbed, forming with the nucleus an
invariant structure called the core or the remainder of the atom.
Thus the problem is reduced to a study of the energy of the quantum
states of the emission electron; it differs from the analogous problem
for hydrogen, inasmuch as the emission electron is subject to the
force exerted by the whole remainder of the atom, not only by the
nucleus. This force is of course quite complex and varies in time.
In general we must limit ourselves to a rough evaluation by
substituting for it a time-independent central field. The problem is
thus reduced to one of central motion under the action of a
non-Newtonian force. Solving it and then applying the three
Sommerfeld conditions (the system having three degrees of freedom), we
succeed in finding for the emission electron a series of energy levels
(depending upon three quantum numbers¹³) which are in general in
good qualitative agreement with observed values. However, we
are far from being able to attain quantitatively exact results.

Such considerations have served as the basis for an enormous
amount of interpretation and organization of experimental results
in the field of spectroscopy. This work consists essentially in
assigning to each spectral term a group of two quantum numbers $n$
and $l$. This assignment retains its value independently of the
mechanical model of Bohr and Sommerfeld, and it still constitutes
the key for the interpretation of spectra. In fact, a series of rules,
discovered partly empirically, partly by theoretical considerations
(and today for the largest part justified by quantum mechanics),
enables us to deduce many of the characteristics of a line, such as
intensity, Zeeman effect, Stark effect, and others, by using the
quantum numbers corresponding to the two spectral terms giving
rise to the line.†

* They are called: principal quantum number $n$, azimuthal quantum number $l$,
magnetic quantum number $m$. The last does not influence the energy, except
in the case of the Zeeman effect; normally, however, the terms depend only
on the first two.

† See, for instance, No. 23 of the Bibliography.

But from this search for quantum numbers corresponding to
each energy level there arose the necessity for a new hypothesis.
It was found that three quantum numbers (or rather two in the
absence of a magnetic field) are insufficient to define an energy level,
and a satisfactory scheme was obtained only when a fourth quantum
number, called inner quantum number, was introduced, which,
however, could only assume two values in each case. There remained
the problem of its mechanical interpretation. After a hypothesis
of Landé had proved insufficient for the purpose, a hypothesis
stated in 1925 by Uhlenbeck and Goudsmit,‡ and independently
by Bichowsky and Urey,§ proved to be entirely satisfactory. It
has been of decisive importance in the further development of
atomic physics, proving to be an ever-growing source of results
agreeing with experience. This hypothesis consists in attributing
to the electron, in addition to electric charge and mass, a magnetic
moment of magnitude equal to $eh/4\pi mc$ (that is, to a Bohr magneton;
see §61, Part II), and a (mechanical) angular momentum having the
same direction and having a value $h/4\pi$, as if we were dealing with
a small top whose axis was magnetized. At one time it was thought
that this property could be interpreted by imagining the electron as
a charged sphere rotating about an axis: the rotation of the mass was
to have produced the mechanical angular momentum, whereas the
rotation of the electric charge, being equivalent to a system of
circular currents, was to have given rise to the magnetic moment.
This model soon had to be abandoned, but nevertheless the
hypothesis of Uhlenbeck and Goudsmit has retained the improper name
hypothesis of the spinning electron, and the angular momentum of an
electron is today designated by the name of spin. The electron has
thus lost its spherical symmetry by acquiring a privileged axis,
namely, the axis of its spin.

Applying to the angular momentum of the electron a condition
analogous to those of Sommerfeld, we are led to assume that when
the electron finds itself in a magnetic field, it can take on only two
orientations, namely, with its spin parallel or antiparallel to the
magnetic field, to each of which there corresponds a different value
of the energy. The two values of the inner quantum number are
used merely to distinguish between the two energy levels
corresponding to these different possibilities of orientation.
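
To give an idea of the magnitudes involved, the following Python sketch evaluates the Bohr magneton, the spin angular momentum $h/4\pi$, and the two energies corresponding to the two orientations in a given field. It is an illustration under stated assumptions, not part of the original treatment: it works in SI units, where the Bohr magneton is written $eh/4\pi m$ instead of the Gaussian form $eh/4\pi mc$ used above; it takes the energies of the two orientations to be $-\mu_B B$ and $+\mu_B B$; and the function names and the field of 1 tesla are chosen only for the example.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron rest mass, kg


def bohr_magneton():
    """Bohr magneton in SI units (J/T): e h / (4 pi m)."""
    return E_CHARGE * H / (4.0 * math.pi * M_E)


def spin_angular_momentum():
    """Angular momentum attributed to the electron, h / (4 pi), in J s."""
    return H / (4.0 * math.pi)


def orientation_energies(b_field_tesla):
    """Energies (J) of the two allowed orientations, lower value first.

    Assumption: a moment of one Bohr magneton lying along or against the field.
    """
    mu = bohr_magneton()
    return (-mu * b_field_tesla, +mu * b_field_tesla)


lower, upper = orientation_energies(1.0)  # illustrative field of 1 tesla
print(f"Bohr magneton: {bohr_magneton():.4e} J/T")
print(f"Level separation at 1 T: {(upper - lower) / E_CHARGE:.3e} eV")
```

The separation between the two levels grows in proportion to the field, which is why the inner quantum number only needs to label two possibilities.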

The hypothesis of the spinning electron was later shown to be 
very adaptable to the interpretation of the magnetic properties of 
metals; in fact, some phenomena of this class (gyromagnetic effects) 
have made it possible to measure the ratio of the magnetic moment 
to the angular momentum of the electron, which yielded results in 
agreement with the values given above. 

‡ Nature 117, 264 (1926); Physica 5, 266 (1925).

§ Proc. Nat. Acad. Sci. 12, 80 (1926).





Finally we shall mention the fact that the methods of
interpretation of atomic spectra have been successfully extended to the
spectra emitted by molecules (band spectra), many of which,
especially in diatomic molecules, have been perfectly interpreted by
means of the same theoretical principles. We shall, however,
confine ourselves here to the consideration of the mechanics of atoms,
referring to other volumes for the molecules.

26. The crisis of the modelistic theories. Although the Bohr-
Sommerfeld theory served as a guide for the theoretical and
experimental investigations mentioned in the preceding sections and for
many others, it nevertheless rests upon postulates which are so
unsatisfactory (mainly because of the arbitrary introduction of
discontinuity) that it is no longer considered to be a definitive
expression of a physical theory. Instead it is viewed as a
provisional codification, so to speak, of the changes that must be made
in classical mechanics and electromagnetic theory in order to make
them applicable to the atomic domain. Furthermore, although the
general concept of energy level and the Bohr formula (18)
concerning frequencies have proved in all cases to be in such close
agreement with experimental results as to make us feel certain that they
represent actual physical reality, the experimental proofs of the
exactness of the method of determination of levels by calculation
of the mechanical orbits and the subsequent application of the
Sommerfeld conditions are considerably less solid. In fact, the
only case in which this method leads to quantitatively exact results
is the one of hydrogenlike systems. In the case of helium and in
the few other cases in which the method has been applied (the
serious mathematical difficulties being overcome by
approximations), energy levels have been found in open contradiction with
experiment. Later it was found in certain cases (for example, in
the prediction of the anomalous Zeeman effect) that the formulas
deduced from the Bohr-Sommerfeld theory require slight
modifications, such as the substitution of $l(l + 1)$ for $l^2$, where $l$ is a quantum
number. These modifications were determined empirically without
any justification for them being found within the realm of the
Bohr-Sommerfeld theory.

This theory then encountered other serious difficulties in its
application to optical dispersion, to certain features of collisions
between electrons and atoms (Ramsauer effect), and to various
other questions.

While these difficulties accumulated in the field of atomic
mechanics, the situation in the field of the theory of light appeared
no less serious: as we have mentioned, there existed two models,
one undulatory and one corpuscular, each of which permitted the
exact interpretation of one category of phenomena, but was
incompatible with the other.

The various attempts to substitute more coherent
representations for the atomic model of Bohr and Sommerfeld and for the two
models for light were all doomed to failure, so that the situation of
theoretical physics around 1925 was characterized by a sense of
discomfort and by the feeling that a profound revision of the
fundamental principles of theoretical physics was necessary.

The new logical form which these principles were to take
appeared clearly only in the following year, when the new quantum
mechanics had already achieved brilliant successes on a purely
formal basis. But already in 1925 several of the major exponents
of theoretical physics had formed the conviction that this critical
situation had arisen from the presupposition, inherent in all previous
theories, that the atoms, photons, and other entities of the atomic
domain must be thought of in terms of extremely small bodies or
mechanisms, even though subject to laws entirely different from
those governing ordinary bodies. They may, however, be
essentially different entities, to which our intuitive concepts of body,
motion, and so on, do not apply (indeed, there is no reason why
they should apply). In other words, we must consider the
possibility that the atomic world, though governed by mathematically
expressible laws, is not representable by any intuitive model
(see §34).

The new direction which theoretical physics came to take after
1925 as a consequence of these new ideas is usually designated by
the name of quantum mechanics.‖ It has been developed in
different forms, of which we shall give a purely historical and informative
account in the following chapter, reserving for other parts of the
book the systematic presentation of their essential features.

‖ This expression is not to be confused with the nowadays rather generic
one of quantum theory, which embraces all the theories in which Planck's
constant $h$ has an essential part. Quantum theory thus includes the Bohr-
Sommerfeld theory (now sometimes called classical quantum theory), as well as
quantum mechanics in its various forms (wave mechanics, matrix representation,
and operator representation).



CHAPTER 4 
Quantum Mechanics 

27. The quantum mechanics of Heisenberg (matrix method).

The new approach mentioned at the end of the preceding chapter
was inaugurated by W. Heisenberg in a paper¹ published in July
1925. The fundamental idea advanced in that work is that some
of the quantities inherent in the atomic model used in the quantum
theory (such as the coordinates of an electron within an atom at a
given instant or the duration of an orbital revolution) cannot, at
present, be measured directly. Hence, in view of the fact that reasoning
based upon these quantities leads to known difficulties, it is
legitimate to doubt their actual physical significance and the possibility
of their future measurability. Other quantities, however (such as
emitted frequencies, intensities, and so on), are directly observable.
Hence instead of looking for a geometrical and mechanical model
that will make it possible to obtain values of observable quantities
from a nonobservable structure, it is better to try to relate the
observable quantities directly, without the intervention of any
model.

However, the direct relations between observables are in general
not expressible by ordinary algebra, and hence the further
development of Heisenberg's idea led to the utilization of matrix algebra, a
mathematical process which had been known for some time but
which had not yet found any applications in the field of physics.
This method was greatly developed, mainly by Heisenberg, Born,
and Jordan. In many cases it led once again to the results of the
Bohr-Sommerfeld theory, whereas in others it led to results which
are in even better accord with experience.

The importance of the progress achieved by Heisenberg is not
limited to these considerations. He was also able to deduce all the
results, by a unified method, from the same organic system of
postulates, and hence to substitute, for a theory based on partially
contradictory foundations, a theory perfectly coherent from a
logical point of view. However, the matrix method contains
within itself, along with these merits, the serious inconvenience of
being rather difficult to comprehend and of not satisfying the needs
of intuition. These objections arise from the use of such an
unusual and somewhat complex mathematical apparatus as matrix
algebra, but above all from the required renunciation of any
geometrical or mechanical model. This renunciation proves to be
necessary in order to be able to formulate in a precise and coherent
form the laws of the atomic world.

¹ Zeits. f. Physik 33, 879 (1925).

The underlying reason for the impossibility of basing atomic
mechanics on a mechanical model without sacrificing logical
coherence or precision has been elucidated in a later paper of
Heisenberg entitled "Über den anschaulichen Inhalt der
quantentheoretischen Kinematik und Mechanik" ("On the Intuitive
Content of Quantum-theoretical Kinematics and Mechanics"),² in
which the so-called "uncertainty principle" is established. This
principle, which may be termed the key to all atomic mechanics,
has made it possible to place quantum mechanics in its true light.
We shall return to this principle and the consequent impossibility
of a rigorous atomic model in greater detail in what follows.

28. Wave mechanics. Almost simultaneously with the matrix
method, another method for treating atomic problems arose and
was developed. This theory, called wave mechanics, was originally
suggested by L. de Broglie.³ Later it was developed and placed in
a new perspective by E. Schrödinger, who outlined its
fundamentals in a series of lucid papers published from February 1926 on.⁴

The Schrödinger method starts with the observation (which
goes back to Hamilton) that the classical laws of point mechanics
can be put into a form analogous to the laws of geometrical optics
(for example, the principle of least action for the motion of a point
is analogous to Fermat's principle for light rays). Some of these
analogies can even be seen without having recourse to formulas.
For instance, in a homogeneous medium, light is propagated with
rectilinear uniform motion, just like a point particle in a field of
uniform potential. If instead the index of refraction varies from
point to point, the light rays will be curved (such as in a mirage),
and their velocity is no longer uniform. This is similar to the
motion of a point particle if the potential is a function of position
(force field). The potential in this analogy corresponds to the index
of refraction, as we shall see.

² Zeits. f. Physik 43, 172 (1927).

³ Thesis presented to the University of Paris, 1924; Ann. de Physique 10, 3,
22 (1925).

⁴ Ann. d. Physik 79, 361, 489, 734 (1926). These and other fundamental
papers are collected in one volume (see Nos. 17 and 17b of the Bibliography).

Now it is known that the laws of geometrical optics fail in all
phenomena (diffraction) where screens, slits, bars, and other
obstructions of dimensions very small or at least comparable with a
wavelength are involved. In these cases we must employ wave
optics instead, of which geometrical optics represents only a first
approximation. Similarly, according to Schrödinger, classical
point mechanics represents only a first approximation of more
general mechanical laws. If we want to retain the parallelism
with optics, we must consider that these laws are analogous to the
laws of wave optics. There must then exist in mechanical problems
a quantity corresponding to the wavelength; and when dealing with
systems of dimensions large compared with that quantity, we shall
be able to use classical mechanics with sufficient approximation.
However, in systems of dimensions comparable with a wavelength,
phenomena corresponding to those called diffraction in optics will
occur, and hence it will be necessary to use wave mechanics.
Therefore if the mechanical wavelength is of atomic dimensions (that is,
of the order of magnitude of 1 angstrom), there may be an
explanation of how classical mechanics applies well to ordinary
bodies but fails in the interpretation of atomic phenomena.
Schrödinger's idea, then, consisted in trying to construct a mechanics
which bears the same relation to classical mechanics as does wave
optics to geometrical optics.
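
The orders of magnitude behind this argument can be made concrete with a short Python sketch. It assumes the relation $\lambda = h/mv$ for the mechanical wavelength, which the book introduces only in §29; the two sample bodies (an electron moving at $10^6$ m/s and a one-microgram grain moving at 1 mm/s), the factor of 1000 used to decide what "large compared with the wavelength" means, and all function names are hypothetical choices made for illustration.

```python
H = 6.62607015e-34  # Planck constant, J s


def mechanical_wavelength(mass_kg, speed_m_s):
    """Wavelength associated with a particle, lambda = h / (m v), in meters."""
    return H / (mass_kg * speed_m_s)


def classical_mechanics_adequate(wavelength_m, system_size_m, ratio=1000.0):
    """True when the system is much larger than the wavelength.

    The threshold `ratio` is an arbitrary illustrative choice.
    """
    return system_size_m > ratio * wavelength_m


electron = mechanical_wavelength(9.1093837015e-31, 1.0e6)  # electron, 10^6 m/s
grain = mechanical_wavelength(1.0e-9, 1.0e-3)              # 1 microgram, 1 mm/s

print(f"electron: {electron:.2e} m, grain: {grain:.2e} m")
print("grain in a 0.1 mm slit, classical?",
      classical_mechanics_adequate(grain, 1.0e-4))
print("electron in a 1 angstrom atom, classical?",
      classical_mechanics_adequate(electron, 1.0e-10))
```

Under these assumptions the electron's wavelength is of atomic size, whereas that of the grain is smaller than any obstacle it could meet, so that diffraction effects are to be expected only in the first case.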

This theory is developed in rather singular fashion, inasmuch as
it has at first the appearance of a purely mathematical operation
applied to an abstract quantity satisfying an equation
characteristic of wave phenomena. Later Schrödinger sought to relate
this quantity to a physical model by interpreting it as an expression
of the electric charge density. It was only later that Born⁵
suggested its probabilistic interpretation, which today, in the light of
the uncertainty principle, must be recognized as the only legitimate
one.

⁵ Zeits. f. Physik 37, 863; 38, 803 (1926), and 40, 167 (1927).

The Schrödinger theory, even in its preliminary phase, in which
it represented a mathematical procedure of unknown and doubtful
meaning, immediately gave the impression of constituting an
enormous progress over the Bohr-Sommerfeld theory and of coming
rather close to the profound nature of things. This impression
rested on the fact that wave mechanics does not postulate the
existence of privileged orbits and hence of discrete energy levels;
the latter emerge as a consequence of a postulate that is more in
accord with ordinary conventions, namely, that a certain function
be finite and continuous everywhere.

In this theory, the discontinuity arises in an altogether natural
manner from the mathematical procedure, in a way quite similar
to the procedure by which, in acoustics, it is shown that a vibrating
system can produce only discrete notes.
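
The acoustical analogy can be illustrated numerically. The Python sketch below is not Schrödinger's calculation; it treats the simplest vibrating system, a string of length $L$ fixed at both ends, as an assumed stand-in. The solution of $y'' = -k^2 y$ that vanishes at $x = 0$ is $\sin kx$, and the requirement that it also vanish at $x = L$ is imposed by scanning $k$ and refining each sign change by bisection. No discreteness is put in by hand; the function names and the scan parameters are illustrative.

```python
import math


def end_value(k, length):
    """Value at x = length of sin(k x), the solution of y'' = -k^2 y with y(0) = 0."""
    return math.sin(k * length)


def allowed_wavenumbers(length, k_max, steps=10000):
    """Return every k in (0, k_max] for which the solution also vanishes at x = length."""
    roots = []
    dk = k_max / steps
    k_lo, f_lo = dk, end_value(dk, length)
    for i in range(2, steps + 1):
        k_hi = i * dk
        f_hi = end_value(k_hi, length)
        if f_lo * f_hi < 0:  # a sign change brackets an admissible k
            a, b, fa = k_lo, k_hi, f_lo
            for _ in range(60):  # refine the bracket by bisection
                mid = 0.5 * (a + b)
                fm = end_value(mid, length)
                if fa * fm <= 0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
        k_lo, f_lo = k_hi, f_hi
    return roots


modes = allowed_wavenumbers(length=1.0, k_max=10.0)
print([round(k / math.pi, 6) for k in modes])  # admissible k in units of pi / L
```

Only isolated values of $k$ survive the boundary condition, which is the sense in which a condition of regularity imposed on a continuous function can single out a discrete set of states.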

With these principles, Schrödinger was able to calculate the
hydrogen spectrum, the Zeeman and Stark effects, the oscillator,
and so on, always obtaining results in agreement with those obtained
by Heisenberg's matrix method. This coincidence is not accidental,
since Schrödinger himself soon showed that the two methods,
although originating from very different concepts, are strictly
equivalent. That is, they represent two different forms of the same
mathematical procedure and must therefore lead to identical
results in all cases. Hence in the treatment of various problems
we may choose one or the other method, according to
circumstances. However, for a general presentation, the Schrödinger
method has the advantage of leaning more strongly upon intuition
and of requiring a less exceptional mathematical apparatus;
therefore we shall use it as an introduction to the more general methods
of quantum mechanics.

29. Diffraction of electrons and of atoms. According to wave
mechanics, the behavior of a beam of electrons (for example, a
beam of cathode rays), atoms, or molecules is governed by
mathematical laws quite similar to those which govern the propagation
of a bundle of waves, as if the particles were guided by the latter.
These waves, which we shall call de Broglie waves, have a wavelength
$\lambda$ which depends on the velocity $v$ of the particles and which is
given, as we shall see in §33, by $h/p$, where $p$ is the momentum (a
relation analogous to the one which holds for photons; see §8). For
velocities small with respect to $c$ this formula becomes

$$\lambda = \frac{h}{mv}, \tag{26}$$

where $m$ is the rest mass of a particle. In particular, it can be
deduced that if the beam encounters a grating, it must be diffracted
like a light beam. But since it is practically impossible, for
experimental reasons, to obtain a well-defined beam of very slow particles,
the wavelength actually cannot be made to exceed a certain limit,
which is of the order of several angstroms (corresponding to
accelerating potentials of a few tens of volts). Therefore a parallel must
be drawn to X rays rather than to light waves, properly speaking; and
hence for diffraction we must normally use a crystal instead of an optical
grating, in analogy with the experiments of Laue and of Bragg.⁶
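
The statement that potentials of a few tens of volts correspond to wavelengths of a few angstroms can be checked with formula (26). The Python sketch below assumes that the electron starts from rest and acquires its velocity from the relation $eV = \tfrac{1}{2}mv^2$, which is valid only for velocities small with respect to $c$; the function names and the sample potentials are illustrative.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron rest mass, kg


def velocity_from_potential(volts):
    """Speed (m/s) of an electron accelerated from rest through `volts` (non-relativistic)."""
    return math.sqrt(2.0 * E_CHARGE * volts / M_E)


def wavelength_from_velocity(speed, mass=M_E):
    """Formula (26): lambda = h / (m v), in meters."""
    return H / (mass * speed)


def electron_wavelength_angstrom(volts):
    """De Broglie wavelength in angstroms for a given accelerating potential."""
    return wavelength_from_velocity(velocity_from_potential(volts)) * 1e10


for volts in (10, 54, 100, 370):
    print(f"{volts:4d} V -> {electron_wavelength_angstrom(volts):.2f} angstrom")
```

Since the wavelength falls off only as the inverse square root of the potential, the whole range of potentials accessible in practice gives wavelengths comparable with the spacings between atoms in a crystal.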

Apparently this idea was first suggested by Elsasser, in 1925,
soon after the appearance of the first works on wave mechanics.
But the experiment on electron diffraction by a crystal was
performed for the first time by Davisson and Germer, who made it
famous in 1927 and who succeeded thus in measuring the "electron
wavelength" as a function of the velocity, finding formula (26) to
be fully verified. This discovery, which profoundly altered the
usual concepts of the nature of the electron, constitutes the most
direct and most remarkable confirmation of Schrödinger's wave
theory. The experiment of Davisson and Germer was further
modified and perfected by many other investigators, until today
there exists a technique, in a certain sense parallel to the spectral
analysis of X rays. In a few cases it has even been found
advantageous to employ the diffraction of electrons of known wavelength
for a study of various problems of structure, as has been done for
some time with X rays on a very large scale. In view of the smaller
penetration of electrons, the new method is particularly well
adapted to the study of the surface layers of the diffracting body.

It has furthermore been verified that diffraction phenomena
arise not only from electrons but also from all material particles
directed in a beam upon a crystal. This assumption was proved
experimentally by Stern and his collaborators and by Johnson and
others, who used molecular beams of hydrogen molecules, atomic
hydrogen, and helium. For these particles as well, the validity of
formula (26) has been confirmed.⁷

⁶ Nevertheless, Rupp [Zeits. f. Physik 52, 8 (1928)] has succeeded in
obtaining the diffraction of electrons from 70 to 300 volts by an optical grating, used
with extremely grazing incidence, and in verifying the predictions of diffraction
theory within 2 or 3 per cent.

30. Experiments performed by the Laue method. We shall now
outline the first memorable experiments of Davisson and Germer.⁸
The electrons were emitted (Fig. 11) from a filament of
incandescent tungsten F, and were accelerated by a field established
between F and the diaphragm D by battery B. By varying the
potential of the latter, it was possible to vary the velocity of the
electrons (in the experiments described, the potential was varied
from 30 to 370 volts). The diaphragms D and D' served to define a
narrow beam of electrons incident normally upon K, a nickel crystal
cut parallel to one face of the octahedron.⁹ The electrons were
reflected by the crystal, and their distribution in various directions
was studied by collecting them in a Faraday cup C (protected by a
screen against electrostatic influences and provided with a narrow
aperture so as to collect only the electrons scattered in a given
direction), which was connected to a galvanometer g. By
displacing the collector C along the graduated arc cc' and observing
the deflections of the galvanometer for the various positions, we can
study the distribution of the diffracted electrons at various values
of the angle $\theta$. Then in order to study the distribution at different
azimuths, it was possible to rotate the crystal about the axis xx' and
to determine its orientation by means of the graduated circle mm'.
Of course, the whole apparatus was enclosed in a highly evacuated
vessel.

⁷ For particulars on the electron and molecular diffraction experiments, and
for many references, see Bibliography No. 30 and also the monograph of
J. J. Trillat, Les preuves expérimentales de la mécanique ondulatoire (Paris:
Hermann, 1934).

⁸ Phys. Rev. 30, 705 (1927); J. Chem. Ed. 5, 1041 (1928).

⁹ Nickel crystallizes in a monometric system, and the crystal lattice is of
the face-centered cubic type. The faces of the octahedron are designated
by (111).

This experimental arrangement is analogous with the one of
Laue for X rays (the only difference being that the diffracted beams
were observed on the same side of the crystal instead of on the other
side), and its theory can be established on the same basis that is
used for the interpretation of the Laue experiment. However,
there is one important difference between the two cases. In the
electron case it is necessary to take into account the fact that the
wavelength inside the crystal has a value $\lambda'$ different from the
value $\lambda$ that it has outside, because of the different propagation
velocity of the de Broglie waves within the material and in vacuo.
This difference, however, is negligible in the case of X rays, for
which the index of refraction for all substances is practically equal
to unity. The necessity of taking this fact into account was
pointed out by Bethe, who showed how the experiments of Davisson
and Germer, when interpreted correctly, were in full accord with
the Schrödinger theory. The existence of an index of refraction, in
addition to being foreseen for theoretical reasons, is thus
demonstrated experimentally; and the experiments of Davisson and
Germer also permit us to calculate its value, which we shall call $\mu$
and which comes out larger than 1, and tends toward 1 with an
increase of $v$. Therefore, when establishing the theory of these
experiments, we shall remember that inside the crystal the
wavelength will not be $\lambda$ but

$$\lambda' = \frac{\lambda}{\mu}. \tag{27}$$

We shall now consider a beam of electrons incident normally
upon the surface ss of the crystal (see Fig. 12). Let us select one of
the planes which may be drawn through the crystal, forming, for
instance, an angle $\varphi$ with the surface. It is known from Bragg's
theory that this and all other planes parallel to it behave like
partially reflecting surfaces. In general, the plane waves reflected
by them will interfere destructively, except for the case where the
wavelength $\lambda'$ has a value such that Bragg's relation is satisfied:

$$n\lambda' = 2d \cos\varphi, \tag{28}$$

where $d$ is the distance between two adjacent crystal planes and
$n$ is an integer, that is [see (27)], for the wavelengths $\lambda$ such that

$$n\lambda = 2d\mu \cos\varphi. \tag{29}$$

In this case we have selective reflection.

Let us suppose then that $\lambda$ has one of those values for which
reflection occurs, and let us look for the direction in which the
reflected waves will travel outside the crystal. Upon leaving the
crystal they suffer a refraction, and hence the emerging ray (normal
to the emerging waves) forms an angle $\theta$ with the normal to the
surface given by

$$\frac{\sin\theta}{\sin 2\varphi} = \mu,$$

since the angle of incidence upon the surface is $2\varphi$, as can be seen
from the figure.

Hence we have $\sin\theta = 2\mu \sin\varphi \cos\varphi$.

Solving for $\cos\varphi$ and substituting into (29), we get

$$n\lambda = d\,\frac{\sin\theta}{\sin\varphi}. \tag{30}$$

Calling $D$ the distance between the intersections of the two
crystal planes with the surface ss (which is a constant of the crystal,
known from X ray studies), we get, as seen from the figure,

$$d = D \sin\varphi,$$

and hence (30) becomes

$$n\lambda = D \sin\theta. \tag{31}$$

The index $\mu$ does not enter in this equation, which is the same that holds
for X rays. Thus it is possible, from a measurement of $\theta$, to obtain
$\lambda$ without making any assumption concerning the index of
refraction $\mu$.
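
The cancellation of $\mu$ can be verified numerically. The Python sketch below builds the geometry forward: it chooses an index of refraction, an inclination $\varphi$ of the planes, and a surface spacing $D$, computes the selectively reflected wavelength from (29) with $d = D \sin\varphi$, obtains the emergence angle from the refraction law $\sin\theta = \mu \sin 2\varphi$, and then compares $n\lambda$ with $D \sin\theta$. The numerical values and the function name are hypothetical and serve only to exercise the formulas.

```python
import math


def laue_geometry(mu, phi_deg, big_d, n=1):
    """Return (wavelength, emergence angle in degrees) for one family of planes.

    mu      : index of refraction of the crystal for the electron waves
    phi_deg : angle between the crystal planes and the surface, degrees
    big_d   : spacing D of the plane traces on the surface (same unit as the result)
    """
    phi = math.radians(phi_deg)
    d = big_d * math.sin(phi)                   # d = D sin(phi)
    lam = 2.0 * d * mu * math.cos(phi) / n      # equation (29)
    sin_theta = mu * math.sin(2.0 * phi)        # refraction on leaving the crystal
    if sin_theta > 1.0:
        raise ValueError("the reflected wave cannot emerge from the crystal")
    return lam, math.degrees(math.asin(sin_theta))


for mu in (1.0, 1.05, 1.1):  # illustrative values of the index
    lam, theta = laue_geometry(mu, phi_deg=20.0, big_d=2.15)
    check = 2.15 * math.sin(math.radians(theta))  # right-hand side of (31)
    print(f"mu = {mu:.2f}: lambda = {lam:.4f}, D sin(theta) = {check:.4f}")
```

The selected wavelength and the emergence angle both change with $\mu$, yet the two sides of (31) remain equal for every value tried, as the derivation requires.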

But if we want to investigate for what wavelengths a given
system of crystal planes gives rise to selective reflection, we must
make use of (29), in which $\mu$ does enter. Thus the selective reflection
occurs for wavelengths different from those for which it occurs in
X rays; but for each of these the reflected beam has the same
direction that it would have if we dealt with X rays of the same
wavelength.

As we know, a great number of crystal planes exist in a crystal.
In practice, however, only those in which the atoms lie sufficiently
close together are effective in diffraction. To each of these
corresponds a value for the right-hand side of equation (28) that is a
constant characteristic of the system of crystal planes.

Therefore if we direct the rays normally upon the crystal in the
manner described, we shall in general not have any diffracted beam,
except for the case in which $\lambda'$ has a value such that, when multiplied
by an integer $n$, it gives a value coinciding with the characteristic
constant of an actually effective crystal plane. Hence the
experiment must be performed in the following manner. Fixing a value
for the accelerating potential of the electrons, and thus for their
velocity and for $\lambda'$, we investigate the distribution of the reflected
electrons by bringing the collector C into various positions. In
general we shall find a continuous distribution (a phenomenon
analogous with that of optical scattering, which is always
superimposed upon diffraction). Then we vary the potential, and
hence $\lambda'$, and again determine the distribution, and so forth until a
potential is found for which a maximum occurs in a certain direction,
that is, until diffraction is superimposed upon the scattering.
Equation (31) then permits us to find $\lambda$, and we have the wavelength
corresponding to a given potential and hence to a given velocity.
Knowing $\lambda$, we can calculate the index of refraction $\mu$ corresponding
to that wavelength by means of equation (29).

Figure 13 represents the results of such a series of experiments.
Each polar diagram is obtained by laying off in each direction a
distance proportional to the number of electrons scattered in that
direction, and the various diagrams correspond to different
velocities of the incident electrons. As can be seen, for electrons
accelerated through 54 volts there appears a well-defined diffracted
beam in the direction $\theta = 50°$.
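
The result quoted for 54 volts lends itself to a direct numerical comparison between equation (31) and formula (26). The Python sketch below is only an illustration: the spacing $D = 2.15$ angstroms for the nickel face is a value commonly quoted for this experiment and is not given in the text, the reflection is assumed to be of first order ($n = 1$), and the wavelength expected from (26) is computed with the non-relativistic relation $eV = \tfrac{1}{2}mv^2$.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron rest mass, kg
D_NICKEL = 2.15e-10         # assumed spacing D on the nickel face, m (not from the text)


def wavelength_from_angle(theta_deg, big_d=D_NICKEL, n=1):
    """Equation (31): n lambda = D sin(theta); returns lambda in meters."""
    return big_d * math.sin(math.radians(theta_deg)) / n


def wavelength_from_potential(volts):
    """Formula (26) combined with e V = m v^2 / 2; returns lambda in meters."""
    return H / math.sqrt(2.0 * M_E * E_CHARGE * volts)


measured = wavelength_from_angle(50.0)        # from the position of the maximum
predicted = wavelength_from_potential(54.0)   # from the accelerating potential
relative_difference = abs(measured - predicted) / predicted

print(f"from the angle:     {measured * 1e10:.3f} angstrom")
print(f"from the potential: {predicted * 1e10:.3f} angstrom")
print(f"relative difference: {relative_difference:.1%}")
```

With the assumed value of $D$, the two wavelengths agree to within about 2 per cent, which is the kind of agreement on which the verification of formula (26) rests.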

31. Experiments performed by the Bragg method. Davisson and
Germer have also performed some measurements by a method
analogous to the one used by Bragg with X rays.

The electrons are made to hit the crystal, not normally but at an
angle of incidence $\theta$ (Fig. 14). The direction of propagation, in the
interior of the crystal, makes an angle $\theta'$ with the normal to the
surface, given by

$$\sin\theta' = \frac{\sin\theta}{\mu}. \tag{32}$$

The waves reflected from various crystal planes parallel to the
surface ss as a rule interfere destructively, except for the case where
Bragg's relation is satisfied:

$$n\lambda' = 2d \cos\theta', \tag{33}$$

where $d$ is the distance between the crystal planes, measured
perpendicular to the surface, and $\lambda' = \lambda/\mu$.

Solving for $\cos\theta'$ from (32) and substituting it into (33), we
obtain

$$n\lambda' = \frac{2d}{\mu}\sqrt{\mu^2 - \sin^2\theta},$$

and hence

$$n\lambda = 2d\sqrt{\mu^2 - \sin^2\theta}. \tag{34}$$


In Bragg's experiments with X rays the crystal was made to
oscillate so as to vary $\theta$ over a certain range, and a regular reflection
was produced every time $\theta$ passed through a value satisfying (34).
However, in the experiments with electrons it is more convenient
to keep the angle of incidence $\theta$ fixed and to vary $\lambda$ continuously by
varying the accelerating potential $V$ of the electrons.
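
To see what such a sweep of the potential looks like, the following Python sketch lists, for a fixed angle of incidence, the wavelengths satisfying (34) for successive orders $n$ and the accelerating potentials at which they occur. It rests on several assumptions that are not taken from the text: a spacing $d = 2.03$ angstroms between the (111) planes of nickel, an angle of incidence of 10 degrees, an index of refraction treated as a constant (set to 1 in the example), and the non-relativistic relation $eV = \tfrac{1}{2}mv^2$ together with formula (26).

```python
import math

H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron rest mass, kg
D_111 = 2.03e-10            # assumed spacing of the nickel (111) planes, m


def reflected_wavelength(n, d, theta_deg, mu=1.0):
    """Equation (34): n lambda = 2 d sqrt(mu^2 - sin^2 theta); returns lambda."""
    s = math.sin(math.radians(theta_deg))
    return 2.0 * d * math.sqrt(mu * mu - s * s) / n


def potential_for_wavelength(lam_m):
    """Potential (volts) giving wavelength lam_m, from (26) and e V = m v^2 / 2."""
    return H * H / (2.0 * M_E * E_CHARGE * lam_m * lam_m)


for n in range(3, 9):
    lam = reflected_wavelength(n, D_111, theta_deg=10.0)
    volts = potential_for_wavelength(lam)
    print(f"n = {n}: lambda = {lam * 1e10:.3f} angstrom at about {volts:.0f} V")
```

Because the potential varies as the inverse square of the wavelength, successive orders are spread over a wide range of potentials, which makes the sweep in $V$ a convenient substitute for the oscillation of the crystal.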

Davisson and Germer always used a nickel crystal cut along a
(111) plane. Upon drawing the polar plot of the electron
distribution in various directions, for different values of $V$, they found that
there generally exists an almost uniform distribution, except for
some values of $V$ for which quite a pronounced maximum is found in
one direction, corresponding exactly to the law of optical reflection.

If the index of refraction of the crystal for the electron waves
were 1, equation (34) would be reduced to the Bragg relation. That
is, regular reflection would be obtained for those values of $\lambda$ for
which $n\lambda = 2d \cos\theta$, that is,

$$\frac{1}{\lambda} = n\,\frac{1}{2d\cos\theta} = \text{const.} \times n. \tag{35}$$

Figure 15 shows a diagram of the intensities obtained in the
direction corresponding to regular reflection (with an angle of
incidence of 10°) as a function of $1/\lambda$, $\lambda$ being
calculated from the potential $V$ by means of (26). The arrows
indicate the points where the maxima should fall if (35) were valid.
The influence of the index of refraction can well be seen from the
figure: though (35) is sufficiently well verified for short
wavelengths, because in that case the index of refraction is close to
unity, the maxima for large wavelengths are considerably displaced.
This displacement may be used to calculate $\mu$ as a function of
$\lambda$, or else to establish experimentally for the de Broglie
waves a law analogous to the law of optical dispersion.
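
As an illustration of how the displacement of a maximum yields the
index of refraction, relation (34) can be solved for $\mu$. The
following is a minimal sketch, not a reconstruction of the authors'
data reduction: the function names are hypothetical, and the plane
spacing and the shifted index used in the demonstration are assumed
values chosen only to exercise the formulas.

```python
import math

def bragg_wavelength(n, d, theta_deg, mu=1.0):
    """Wavelength of the n-th order maximum from (34):
    n * lambda = 2 d sqrt(mu^2 - sin^2 theta)."""
    s = math.sin(math.radians(theta_deg))
    return 2.0 * d * math.sqrt(mu ** 2 - s ** 2) / n

def refractive_index(n, wavelength, d, theta_deg):
    """Index of refraction obtained by solving (34) for mu."""
    s = math.sin(math.radians(theta_deg))
    return math.sqrt((n * wavelength / (2.0 * d)) ** 2 + s ** 2)

d = 2.03        # assumed plane spacing in angstroms (illustrative only)
theta = 10.0    # angle of incidence in degrees, as in Fig. 15

lam_ideal = bragg_wavelength(3, d, theta)             # position predicted by (35)
lam_shifted = bragg_wavelength(3, d, theta, mu=1.05)  # hypothetical displaced maximum

print(refractive_index(3, lam_ideal, d, theta))    # recovers mu = 1
print(refractive_index(3, lam_shifted, d, theta))  # recovers the assumed mu
```

In practice one would feed in the wavelength at which a maximum is
actually observed, together with its order $n$; the sketch merely
checks that the inversion is consistent with (34).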
Later experimenters in electron diffraction, using methods
similar to those of Laue or Bragg and employing crystals of different 
substances, obtained results in essential agreement with those 
described above. 



32. Experiments performed by the Debye-Scherrer method. 

The method of Debye and Scherrer for the analysis of X rays and 
the study of the structure of crystals consists, as is well known, in 
letting a thin pencil of X rays fall on a crystalline material in 
powder form and in observing the diffracted rays. Since in the 
powder small crystals are found oriented in all directions, there will 
always be some crystals for which relation (35) is satisfied and which 
give rise to diffracted rays forming a cone whose axis lies along the 
direction of the incident beam. Consequently on a photographic 
plate placed beyond the tube containing the powder, there will be 
concentric rings from whose radii the wavelengths present in the 
beam may be deduced. 
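
The deduction of a wavelength from a ring radius can be sketched as
follows. This is a hedged illustration rather than the procedure of
any particular experimenter: it assumes a flat plate perpendicular to
the beam at a distance `plate_distance` from the powder, an index of
refraction equal to 1, and the Bragg relation written with the
glancing angle $\varphi = 90^\circ - \theta$, so that
$n\lambda = 2d\sin\varphi$ and the diffracted ray is deflected by
$2\varphi$. All names are hypothetical.

```python
import math

def ring_radius(wavelength, plate_distance, d, n=1):
    """Radius of the ring produced by planes of spacing d in order n."""
    glancing = math.asin(n * wavelength / (2.0 * d))
    return plate_distance * math.tan(2.0 * glancing)

def wavelength_from_ring(radius, plate_distance, d, n=1):
    """Invert the geometry: wavelength from a measured ring radius."""
    deflection = math.atan2(radius, plate_distance)  # angle between incident and diffracted ray
    return 2.0 * d * math.sin(0.5 * deflection) / n

r = ring_radius(0.1, 30.0, 2.0)          # lengths in any consistent units
print(r, wavelength_from_ring(r, 30.0, 2.0))
```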

This method has been applied to electrons by G. P. Thomson, 
Rupp, and others. From many points of view, it has proved to be 
more advantageous than that of Davisson and Germer. By means 
of the Debye-Scherrer method, Thomson has been able to extend 
the wavelength measurements to electrons considerably faster than 
those of Davisson and Germer, attaining potentials up to 20,000 
volts. Rupp performed the same experiments with slow electrons 
(several hundred volts), detecting them either photographically or 
by the electrometer method. 
33. Velocity dependence of the wavelength. Theoretical
considerations suggest, as we shall see in §26 of Part II, that the
wavelength $\lambda$ should depend upon the velocity $v$ of the
electrons according to a law of the type

$$\lambda = \frac{a}{mv}, \tag{36}$$


where $a$ is a constant. Now the velocity $v$ is related to the
accelerating potential $V$ by the kinetic energy relation, which for
velocities small compared with $c$ is

$$eV = \tfrac{1}{2}mv^2. \tag{37}$$

Hence we see that (36) is equivalent to

$$\lambda = \frac{a}{\sqrt{2em}}\,\frac{1}{\sqrt{V}}, \tag{38}$$


or else, calling $\lambda_A$ the wavelength in angstroms and $V_v$ the
potential in volts,

$$\lambda_A = 10^8\sqrt{\frac{300}{2em}}\,\frac{a}{\sqrt{V_v}}. \tag{38'}$$


The validity of the formula*

$$\lambda_A = \frac{12.25}{\sqrt{V_v}} \tag{39}$$

has been confirmed by the measurements of the wavelengths $\lambda_A$
(in angstroms) as a function of the accelerating voltage $V_v$ (in
volts) performed by the various methods described. The precision to
which this law is verified is shown in the diagram of Fig. 16, in which
the values measured for $\lambda$ are plotted as ordinates
corresponding to values of $1/\sqrt{V_v}$ along the abscissa. The
straight line is described by equation (39), and we can see that the
experimental points lie quite close to it (within experimental errors).

By identifying (38') with the experimental relation (39), we
obtain for the constant $a$ the value

$$a = 12.25\times10^{-8}\sqrt{\frac{2em}{300}},$$

and upon introducing the values of $e$ and $m$, we find that
$a = 6.614\times10^{-27}$. Thus we are justified in identifying $a$
with Planck's constant $h$ ($= 6.6237\times10^{-27}$ erg-sec), after
which (36) becomes the already mentioned relation (26) between
$\lambda$ and $v$, which we can therefore consider to be based upon
solid experimental foundations.

* See, for instance, Nos. 30 and 35 of the Bibliography.
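
The agreement between the theoretical law and the empirical formula
(39) is easy to check numerically. The sketch below works in SI units
rather than the CGS units of the text, and uses present-day values of
the constants (an assumption of this example, not figures taken from
the book); the function names are hypothetical.

```python
import math

H = 6.62607015e-34       # Planck constant, J s (modern value)
M_E = 9.1093837015e-31   # electron rest mass, kg (modern value)
E_CHARGE = 1.602176634e-19  # elementary charge, C (modern value)

def de_broglie_angstrom(volts):
    """Nonrelativistic lambda = h / sqrt(2 m e V), returned in angstroms."""
    momentum = math.sqrt(2.0 * M_E * E_CHARGE * volts)
    return H / momentum * 1e10

def empirical_angstrom(volts):
    """Empirical relation (39): lambda_A = 12.25 / sqrt(V_v)."""
    return 12.25 / math.sqrt(volts)

for v in (54.0, 100.0, 1000.0):
    print(v, de_broglie_angstrom(v), empirical_angstrom(v))
```

For the 54-volt electrons of the Davisson and Germer experiment both
expressions give a wavelength close to 1.67 angstroms.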



Independently of these, de Broglie had given a theoretical
demonstration of (26) starting from first principles, based upon
relativistic considerations (see No. 5 of the Bibliography).

It should be added that the preceding considerations are valid
if the velocity $v$ is small compared with $c$. Otherwise the kinetic
energy law (37) becomes, according to the theory of relativity,

$$eV = m_0c^2\left(\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right), \tag{40}$$

and the mass $m$ becomes a function of $v$ according to the law

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}. \tag{41}$$

Nevertheless (26) still holds, if the expression (41) is inserted for
$m$. Therefore the dependence of $\lambda$ upon $v$ and hence upon $V$
becomes somewhat more complicated.* The validity of (26), even for high
velocities, has been verified by the experiments performed by Ponte
with fast electrons (up to 17,250 volts) using the Debye-Scherrer
method; he pushed the precision in the measurement of $\lambda$ up to
3 parts per 1000 (the relativistic correction amounts to about 1 per
cent for 10,000-volt electrons). More recently, Tappert^^ has 
extended the verification of the relativistic formula for X up to 
04,000-volt electrons, with an accuracy of 1 part per 1000. 
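
The size of the relativistic correction can be explored with a short
calculation. This is a sketch under stated assumptions: it uses SI
units and modern values of the constants, and it obtains the momentum
from the kinetic energy $T = eV$ through $p^2 = 2m_0T + T^2/c^2$,
which follows from the relativistic energy and mass laws quoted above
together with $p = mv$. The function names are hypothetical.

```python
import math

H = 6.62607015e-34          # Planck constant, J s (modern value)
M0 = 9.1093837015e-31       # electron rest mass, kg (modern value)
E_CHARGE = 1.602176634e-19  # elementary charge, C (modern value)
C = 2.99792458e8            # speed of light, m/s

def wavelength_nonrel(volts):
    """lambda = h / sqrt(2 m0 e V), valid for v much smaller than c."""
    return H / math.sqrt(2.0 * M0 * E_CHARGE * volts)

def wavelength_rel(volts):
    """lambda = h / p with the relativistic momentum for kinetic energy e V."""
    kinetic = E_CHARGE * volts
    momentum = math.sqrt(2.0 * M0 * kinetic + (kinetic / C) ** 2)
    return H / momentum

def relative_correction(volts):
    """Fractional shortening of the wavelength due to relativity."""
    return 1.0 - wavelength_rel(volts) / wavelength_nonrel(volts)

for v in (100.0, 10_000.0, 50_000.0):
    print(v, relative_correction(v))
```

The correction grows roughly in proportion to the accelerating
potential, which is why it matters only for the fast-electron
experiments.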

34. Logical aspect of atomic mechanics. The operational
definition of physical concepts. The electron diffraction experiments
described in the preceding sections have clearly demonstrated that
not only electromagnetic radiation but also so-called corpuscular
rays, such as cathode rays, exhibit a double character; that is, they
conform to a wave model in certain phenomena, to a particle model
in others. Thus an even stricter parallelism is established between
radiation and matter, and the problem of the connections between
photons and electromagnetic waves (see §13) is transferred without
change to the problem of the connections between electrons and
de Broglie waves. In the case of radiation, phenomena of a wave
nature were discovered first, then those of the corpuscular type;
in the case of the electrons the reverse has taken place. It is only
for this historical reason that we have become accustomed to
conceive of radiation mainly in the wave aspect and of electrons in
corpuscular form. The usual language still somewhat favors this
tendency, since we speak more of light waves than of photons, more
of electrons than of de Broglie waves. But in order to be perfectly
objective it is necessary, in the case of electromagnetic radiation as
well as in the case of cathode rays, to attribute equal importance
to the properties of the wave type and to those of the particle type,
and hence to convince ourselves that neither the wave nor the
particle picture, as usually understood (that is, in analogy with the
waves and particles visible to the eye), completely represents the
nature of these physical entities.

* Instead, the dependence of $\lambda$ upon momentum $p$ is in every
case the simple inverse proportionality; that is,

$$\lambda = \frac{h}{p},$$

since $p = mv$ even in relativistic mechanics.

Rev, W, 1085 (1938). 
For some time this fact presented an insurmountable difficulty,
since it seemed that we had to deal with entities gifted with
contradictory properties. The solution of this apparent contradiction
was attained only through a process of logical clarification of the
fundamentals of physics, which in turn came with the formal
development of quantum mechanics. This development has been
guided above all by the following fundamental logical criterion,
which dominates, more or less explicitly, all the works of revision
of physical principles; these works may be considered to have been
initiated by the theory of relativity, and have been continued in a
wider and more profound way in the new atomic mechanics. The
objective of physics is to perform observations and experiments,
and to coordinate their results in as simple a scheme as possible.
Hence its immediate objectives are the observational data. But
it is permissible to introduce other concepts for the interpretation
of these data provided that such concepts be definable and
detectable by means of experiments which, if not practicable, are at
least conceptually possible, that is, not forbidden by any physical or
logical precept. Quantities or relations that cannot be detected
by means of such ideal but conceptually possible experiments may
not be introduced into the reasoning without running the risk of
being in error. Thus every concept of physics must be susceptible
of an operational definition; that is, it must be definable by means
of a series of conceptually possible physical operations. As one
example, simultaneity is defined by indicating a method by which
to decide whether two events are simultaneous or not. As another
example, the electron is defined by indicating a way of detecting it;
the coordinates of an electron, by indicating a procedure to measure
them; and so forth. The only physical questions that have a
meaning are those which ask for the result of one or more experiments
which, at least conceptually, could be performed.

It is now appropriate to clear up the concept of conceptually
possible experiments. These are not only experiments which may
effectively be realized but also the limiting cases of actually possible
experiments, provided that what prevents the limit from being
reached is not a general law but a mass of practical difficulties. For
example, perfectly reversible cycles have been used in
thermodynamic reasoning for some time. This practice is permissible,
since although such cycles may not be obtained in practice, there
is still no theoretical reason why they should be impossible; and the
more the techniques are perfected, the more closely these conditions
are approached. Of course, the discovery of new general laws may
modify certain of these opinions in the future by showing that an
operation which at first was thought to be conceptually possible is
not, or vice versa. Therefore, in speaking of conceptually possible
operations, we imply "according to the physical laws now known."
Making use only of such operations for the definition of concepts,
one is at least certain not to introduce unwanted contradictions with
known physical laws, and hence not to construct a logically
incoherent structure, as may happen (and has happened many times)
if concepts are introduced without being concerned with giving
their operational definition. A famous example is the concept of
absolute time, used in prerelativistic physics without a realization
that its operational definition requires the instantaneous
transmission of signals which, within the realm of hitherto known
physical phenomena, is conceptually impossible. The point of departure
of the relativity theory has indeed been the analysis of the concept
of absolute time, and its operational definition. The same critical
methods, when applied to atomic phenomena, lead to the
uncertainty principle and to the new quantum mechanics.

The rigorous application of this criterion leads to more surprising
results than could be imagined at first sight. The greatest benefit
derived from it has been the clearing of the field of many questions
that, upon an accurate analysis made with operational criteria, have
proved to be devoid of physical meaning. Furthermore, it has
enabled us to recognize that the paradoxical contradiction between
the corpuscular and wave aspects of radiation, and the analogous
contradiction more recently found in connection with matter, do
not contain any actual logical contradiction. These apparent
contradictions arise solely from our tendency to extend to the atomic
domain those geometrical, kinematic, and causal representations
which have been formed in our minds by means of the continual,
unconscious experience accumulated through the observation of the
macroscopic world, which we call "intuitions." We are thus led to
introduce, into the atomic domain, concepts that are not capable
of operational definition; hence the above-mentioned apparent
contradictions. In other words, when speaking of particles
(electrons or photons), we are led to imagine them as extraordinarily
reduced material balls, and hence we are led to attribute to such
entities all the geometrical and kinematic properties that ordinary
bodies possess (such as continuity of trajectory), without realizing
that in this manner we are postulating for the particles properties
which are by no means experimentally proved, or even logically
necessary. It is therefore not surprising that these properties are
in contrast with those determined from experience. But if instead
we associate with the concept of an electron or photon only properties
that may be determined experimentally, we can verify that these
properties, as is most natural, are not in logical contradiction with
each other.

Hence there is no reason whatever why there should exist a
geometrical or mechanical model accessible to our intuition and
capable of completely interpreting all the phenomena of the
microcosm. This is not to deny that models (such as the one of
Rutherford-Bohr-Sommerfeld) may be in many cases of great use as a
heuristic means as well as a concise means of expression, often
furnishing exact, or at least approximate, interpretations of large
classes of phenomena. However, they should not be taken literally
but should be considered in the same light as hydrodynamic
analogies, which aid so substantially in the understanding of
electric current phenomena.

Furthermore, quantum mechanics also furnishes the explanation
of the success of some of the geometrico-mechanical models in
the interpretation of atomic phenomena. In fact, the laws of 
quantum mechanics retain some of the principles of ordinary 
mechanics, such as the principle of the conservation of energy and 
momentum. Besides, there is a continuous transition from the 
laws of micromechanics to those of macromechanics; the latter may, 
in many cases, serve as an approximate (or even exact) expression 
of the former. 

35. The operator method and the Dirac theory of the electron.

P. A. M. Dirac and P. Jordan have made a great contribution to the
clarification of the logical aspect of atomic mechanics by working out
the entire subject once more from an original point of view, based
above all upon the physical concept of state and the mathematical
concept of operator. They were able in this way to construct a
unified and very general method of treatment for atomic problems
(transformation theory), of which the matrix method and the method
of wave mechanics are but particular cases.

Another remarkable result obtained by Dirac was the ability to
show that, in the scheme of quantum mechanics, the mechanical
angular momentum and the magnetic moment of the electron are
necessary consequences of its existence and of the relativity
principle, so that the so-called "electron-spin hypothesis" ceases to
be a postulate in itself, since it is deduced as a consequence of the
fundamental equations. This theory will be developed in Chapter 14.




Part II

WAVE MECHANICS OF A PARTICLE 




CHAPTER 5 

Mathematical Introduction 


1. General remarks on linear homogeneous differential equations
of the second order. In wave mechanics, differential equations
(involving ordinary derivatives) which are linear, homogeneous,
and of the second order are of great importance; these are equations
of the type

$$A(x)y'' + B(x)y' + C(x)y = 0, \tag{1}$$

where the coefficients $A$, $B$, and $C$ are analytic functions of $x$,
and are assumed to be real for real values of $x$. Since $A$ is not
supposed to be identically equal to zero, we can always write the
equation in the form

$$y'' + P(x)y' + Q(x)y = 0, \tag{1'}$$

where we have put

$$P(x) = \frac{B(x)}{A(x)}, \qquad Q(x) = \frac{C(x)}{A(x)}.$$

First of all, we realize that any solution $y(x)$ is certainly regular
(or else may be developed in a series of integral, positive powers
of $x$) for all values of $x$ for which the coefficients $P$ and $Q$
are regular. Only at points where at least one of these coefficients
has a singularity may a singularity for $y$ exist. Such points, called
singular points, or singularities of the equation, are of considerable
importance in the study of its properties. For the moment, however, we
shall exclude them from our considerations, reserving the right to
speak of them later on.

A well-known fundamental property of these equations is the
following: once two particular solutions $y_1(x)$ and $y_2(x)$ which
are independent (that is, whose ratio is not a constant) have been
found, the general solution is found by forming a linear combination
with them, using two arbitrary constants $c_1$ and $c_2$; this
combination is

$$y(x) = c_1y_1(x) + c_2y_2(x). \tag{2}$$

We then say that $y_1$ and $y_2$ have been taken as fundamental
integrals. Of course there is a large degree of arbitrariness in their
choice, since we may replace them by any two of their linear
combinations, provided that they are independent.

In actual problems, we must deal with the two constants $c_1$ and
$c_2$ in such a way that $y$ will satisfy two other conditions imposed
by the problem: for example, that at a given point $x = a$, $y$ and its
derivative $y'$ take on certain determined values (we see immediately
that this is always possible and that $y$ is thereby uniquely
determined); or that $y$ take on two given values at two given points
$a$ and $b$, or else that two given relations are to hold between the
values of $y$ and $y'$ at points $a$ and $b$. In many cases (but not
always) such conditions in conjunction with the equation determine
the function $y$ uniquely. In general, the two points $a$ and $b$ lie
on the real axis and are at the extremes of the interval within which
the function $y(x)$ is of interest; therefore we speak of boundary
conditions. We are mainly interested in the two following types of
boundary conditions:

($\alpha$) $y$ is to vanish at both extremes:

$$y(a) = 0, \qquad y(b) = 0; \tag{3}$$

($\beta$) $y$ is to assume the same values at the two extremes, and
similarly $y'$:

$$y(a) = y(b), \qquad y'(a) = y'(b). \tag{4}$$

Both of these types are contained in the category of "homogeneous"
conditions, to which many of the following considerations may be
extended. They have the property that if $y(x)$ is a solution
satisfying them, then $cy(x)$, where $c$ is an arbitrary constant, is
also a solution satisfying the same conditions.

We find that conditions ($\alpha$) and ($\beta$) are satisfied,
together with the differential equation, if we take $y = 0$ over the
whole interval, that is, if we make both constants of (2) equal to
zero. This solution is evidently of no interest, so that our efforts
will be directed toward finding nonzero solutions; that is, the case
in which the boundary conditions ($\alpha$) or ($\beta$) do not
uniquely determine a solution of the given equation is of interest.
We shall see that this case is exceptional.

Let us consider, for instance, the case of conditions ($\alpha$).
Using expression (2) for the general solution, we are to find two
values, not both zero, for $c_1$ and $c_2$ such that

$$\begin{aligned} c_1y_1(a) + c_2y_2(a) &= 0, \\ c_1y_1(b) + c_2y_2(b) &= 0. \end{aligned} \tag{5}$$

It is known from algebra that this system of homogeneous equations
has nonzero solutions only if

$$\begin{vmatrix} y_1(a) & y_2(a) \\ y_1(b) & y_2(b) \end{vmatrix} = 0. \tag{6}$$

This equation relates the values of the two fundamental integrals
$y_1$ and $y_2$ at the points $a$ and $b$; of course we obtain an
equation of the same form if the solutions $y_1$ and $y_2$ are
replaced by two other fundamental integrals.
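
A concrete case may make the determinant condition tangible. The
sketch below takes the equation $y'' + k^2y = 0$, whose fundamental
integrals may be chosen as $\sin kx$ and $\cos kx$; this particular
equation, the constant $k$, and the function names are assumptions of
the example and do not come from the text.

```python
import math

def boundary_determinant(k, a, b):
    """Determinant of condition (6) for y1 = sin(kx), y2 = cos(kx)."""
    y1a, y2a = math.sin(k * a), math.cos(k * a)
    y1b, y2b = math.sin(k * b), math.cos(k * b)
    return y1a * y2b - y2a * y1b   # equals sin(k (a - b))

def nontrivial_solution(k, a):
    """Solution c1*y1 + c2*y2 that vanishes at x = a (c1, c2 not both zero)."""
    c1, c2 = math.cos(k * a), -math.sin(k * a)
    return lambda x: c1 * math.sin(k * x) + c2 * math.cos(k * x)

a, b = 0.3, 1.3
for k in (2.0, math.pi, 2.0 * math.pi):
    y = nontrivial_solution(k, a)
    print(k, boundary_determinant(k, a, b), y(b))
```

Only for those $k$ that make the determinant vanish does the solution
that is zero at $a$ also vanish at $b$; for any other $k$ the system
for $c_1$ and $c_2$ admits only the zero solution.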



If we can explicitly express $y_1(x)$ and $y_2(x)$ as functions of the
coefficients of the equation, (6) immediately takes the form of a
condition which these coefficients must satisfy for the equation to
yield a nonzero solution for the given boundary conditions (and
hence an infinite number of nonzero solutions, obtainable from the
latter upon multiplying by a constant factor). Although, in general,
no such explicit expression for $y_1$ and $y_2$ exists, (6) always
expresses, in implicit form, a condition (of a functional type) for
the coefficients in question. By similar reasoning for the case of
conditions ($\beta$), we find, instead of (6), the condition

$$\begin{vmatrix} y_1(a) - y_1(b) & y_2(a) - y_2(b) \\ y_1'(a) - y_1'(b) & y_2'(a) - y_2'(b) \end{vmatrix} = 0. \tag{7}$$

In the case of conditions ($\alpha$), condition (6) may be expressed
in an intuitive form in the following manner. Each solution of the
given equation, in the real domain, can be represented graphically
by means of a curve in the $xy$-plane (see Fig. 17). Specifying the
value at one point and the tangent there is enough to specify the
curve completely. Now let us consider the family of all solution
curves originating from the point $A$ ($x = a$) in various directions.
It is easy to show that if one of them is given, any other may be
obtained from it by means of a dilation or contraction along the
$y$-axis. For example, let $y = f(x)$ be the curve starting at $A$ with
a slope 1 [$f'(a) = 1$]; then $y = \lambda f(x)$ (where $\lambda$ is a
constant) will be another solution and will be represented by the
curve starting from $A$ with a slope $\lambda$. By giving to $\lambda$
a convenient value we can thus obtain any one of the solution curves
originating at $A$. It follows that all the curves starting from $A$
cut the $x$-axis at the same points $N_1, N_2, \ldots$ (nodes)
corresponding to the roots of the equation $f(x) = 0$. In other words,
by varying the slope of the tangent at $A$ we do not displace the
nodes, whose position, since $A$ is fixed, is determined solely by the
nature of the equation, that is, by its coefficients. If we try in
this way to find a solution satisfying conditions ($\alpha$), that is,
such that its curve passes, in addition to $A$, through another
predetermined point $B$ ($x = b$), we find that in general such a
solution does not exist. It exists only if one of the nodes
$N_1, N_2, \ldots$ falls at $B$; and in that case, of course, the
infinite number of solution curves issuing from $A$ will satisfy the
required condition. Now the condition that one of the nodes
coincide with $B$, or that $f(b) = 0$, is equivalent to the condition
found above in the form of equation (6).
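
The equivalence can be verified directly in a simple case. As an
illustration only (the equation and the constant $k$ are assumptions
of this example), take $y'' + k^2y = 0$ with fundamental integrals
$y_1 = \sin kx$ and $y_2 = \cos kx$:

```latex
\begin{align*}
f(x) &= \frac{1}{k}\sin k(x - a), \qquad f(a) = 0, \quad f'(a) = 1, \\
N_j &:\; x = a + \frac{j\pi}{k}, \qquad j = 1, 2, \ldots, \\
f(b) = 0 &\iff k(b - a) = j\pi, \\
\begin{vmatrix} \sin ka & \cos ka \\ \sin kb & \cos kb \end{vmatrix}
  &= \sin ka\cos kb - \cos ka\sin kb = -\sin k(b - a).
\end{align*}
```

The determinant vanishes exactly when $k(b - a)$ is a multiple of
$\pi$, that is, exactly when a node of $f$ falls at $B$.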

Note. In the preceding argument $y$ was supposed to be real,
but it may be observed that every complex solution of (1) satisfying
($\alpha$) or ($\beta$) at the extremes is composed of a real and an
imaginary part, which satisfy the equation and the boundary conditions
separately; therefore the reasoning can be applied to each of these
parts.

2. Equations containing a parameter. Eigenvalues and
eigenfunctions. Many cases of interest in wave mechanics and, in
general, in the theory of oscillations of any kind, contain a parameter
(that is, an undetermined constant) $\lambda$ in the coefficients of
the differential equation (1) or (1'). The most interesting case is the
one in which $\lambda$ occurs linearly in the coefficient $C$, so that
the equation may be written in the form

$$A(x)y'' + B(x)y' + [\lambda\alpha(x) + \beta(x)]y = 0. \tag{8}$$
The fundamental integrals $y_1$ and $y_2$ will then also contain the
parameter $\lambda$, and hence the left-hand member of (6) or of (7),
which we shall designate by $\Delta(\lambda)$, will also contain it.
Thus there appears the possibility of determining $\lambda$ in such a
way as to satisfy (6) or (7), and hence there may exist solutions of
the proposed differential equation satisfying the desired boundary
conditions ($\alpha$) or ($\beta$). For this purpose, $\lambda$ must be
a root of the (generally transcendental) equation

$$\Delta(\lambda) = 0. \tag{9}$$

Such values of $\lambda$ are called the eigenvalues of the differential
equation, relative to the interval $(a, b)$ and to the boundary
conditions ($\alpha$) or ($\beta$). Once we have found an eigenvalue
$\lambda_i$, relations (5) or similar ones allow us to find $c_1$ and
$c_2$ in such a way that $y$ satisfies the desired boundary conditions.
[We observe that once (6) is satisfied, equations (5) are a
consequence one of the other, so that they determine only the ratio
$c_2/c_1$, and we may choose $c_1$ at will.] Such an integral of the
differential equation is called an eigenfunction belonging to the
eigenvalue $\lambda_i$.

It should be noted, however, that except for special cases, we 
cannot, in practice, determine the eigenvalues and eigenfunctions 
in this manner, since in order to put down equation (9), it is
necessary to know the general solution of the differential equation. The
above considerations serve only to clarify the concept of eigenvalues 
and eigenfunctions and to make their actual existence plausible. 
The problem of their determination is a fundamental one in wave 
theory. No general methods for its solution are known, only 
artifices that are successful in many cases. Some of these artifices 
will be applied in what follows. 
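
When no closed-form solution is available, the eigenvalue condition
can still be explored numerically. The following sketch is not one of
the artifices of the text; it is a common numerical device (a
"shooting" procedure), shown here under explicit assumptions: the
equation is the special case $y'' + [\lambda + \beta(x)]y = 0$ of (8)
with $A = 1$, $B = 0$, $\alpha = 1$; the solution starting at $a$ with
$y = 0$, $y' = 1$ is integrated by a fourth-order Runge-Kutta scheme;
and its value at $b$ plays the role of the left-hand member of (9).
All function names are hypothetical.

```python
import math

def shoot(lam, beta=lambda x: 0.0, a=0.0, b=1.0, steps=2000):
    """Integrate y'' + (lam + beta(x)) y = 0 from a with y(a)=0, y'(a)=1.
    Returns y(b), which vanishes exactly when lam is an eigenvalue."""
    h = (b - a) / steps
    x, y, dy = a, 0.0, 1.0

    def acc(xx, yy):
        return -(lam + beta(xx)) * yy

    for _ in range(steps):  # classical RK4 for the system (y, y')
        k1y, k1v = dy, acc(x, y)
        k2y, k2v = dy + 0.5 * h * k1v, acc(x + 0.5 * h, y + 0.5 * h * k1y)
        k3y, k3v = dy + 0.5 * h * k2v, acc(x + 0.5 * h, y + 0.5 * h * k2y)
        k4y, k4v = dy + h * k3v, acc(x + h, y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        dy += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        x += h
    return y

def find_eigenvalue(lo, hi, tol=1e-10, **kwargs):
    """Bisection on y(b; lam) between two values of lam that bracket a root."""
    f_lo, f_hi = shoot(lo, **kwargs), shoot(hi, **kwargs)
    if f_lo * f_hi > 0.0:
        raise ValueError("the interval does not bracket an eigenvalue")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid, **kwargs)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

print(find_eigenvalue(5.0, 15.0), math.pi ** 2)
```

For $\beta = 0$ on the interval $(0, 1)$ with conditions ($\alpha$) the
exact eigenvalues are $n^2\pi^2$, which gives a check on the
procedure; a different `beta` can be passed as a keyword argument.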

3. Self-adjoint form of an equation. A differential equation of
the type (1) is said to be in the self-adjoint form if between the
coefficients $A(x)$ and $B(x)$ there exists the relation $B = A'$, so
that the first two terms together form the derivative of $Ay'$, and the
equation may be written

$$\frac{d}{dx}(Ay') + Cy = 0. \tag{10}$$

It should be noted that any equation of the type (1) can be put
into self-adjoint form. In fact, if we multiply (1) by the nonzero
factor

$$\frac{1}{A}\,e^{\int (B/A)\,dx},$$

the three coefficients $A$, $B$, and $C$ become, respectively,

$$P = e^{\int (B/A)\,dx}, \qquad Q = \frac{B}{A}\,e^{\int (B/A)\,dx}, \qquad R = \frac{C}{A}\,e^{\int (B/A)\,dx}, \tag{11}$$

and we can verify immediately that $Q = P'$, so that the equation
can be written

$$\frac{d}{dx}(Py') + Ry = 0. \tag{12}$$

If then $C$ contains a parameter $\lambda$ linearly, as in (8), so
will $R$, and equation (12) will have the form

$$\frac{d}{dx}(Py') + [\lambda p(x) + q(x)]y = 0. \tag{13}$$
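
A short worked case shows the reduction at work. The equation chosen
below is an assumption made for illustration (it does not appear in
this section): take $A = 1$, $B = -2x$, $C = \lambda$.

```latex
\begin{align*}
y'' - 2x\,y' + \lambda y &= 0, \\
\frac{1}{A}\,e^{\int (B/A)\,dx} &= e^{\int (-2x)\,dx} = e^{-x^2}, \\
P = e^{-x^2}, \qquad Q &= -2x\,e^{-x^2} = P', \qquad R = \lambda\,e^{-x^2}, \\
\frac{d}{dx}\!\left(e^{-x^2}y'\right) + \lambda\,e^{-x^2}\,y &= 0.
\end{align*}
```

The result has the self-adjoint form with a parameter entering
linearly, the weight being $p(x) = e^{-x^2}$ and $q(x) = 0$.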

Now the case which occurs most frequently is the one in which
$p$ is a constant, which may be taken equal to 1, so that the equation,
reduced to its self-adjoint form, becomes

$$\frac{d}{dx}(Py') + [\lambda + q(x)]y = 0, \tag{14}$$

where $P$ is always positive, as a result of (11). In what is to follow
we shall refer in general to equations of this type, except when
otherwise stated. Nevertheless, it should be remarked that the
presence of the coefficient $p$ would not introduce substantial
modifications (at least if $p > 0$ over the whole interval considered)
but only a slight complication of the formulas.

It can be shown* that an equation of type (14), with boundary
conditions ($\alpha$), always possesses an infinite number of real
eigenvalues, and that this infinity is discrete, that is, denumerable,
as long as the interval $(a, b)$ is finite, as has been assumed here.
Furthermore, if $P$ is positive over the whole interval, the succession
of eigenvalues is limited from below but has infinity for upper limit.
To each of these eigenvalues there correspond one or, at most, two
linearly independent eigenfunctions (never more than two, since
any other solution can be expressed as a linear combination of these
two, as is well known).

* Cf. No. 25 of the Bibliography.
4. Normalization of the eigenfunctions. Since any eigenfunction
may be multiplied by an arbitrary constant without ceasing to
satisfy the required conditions, there exists for each of them an
infinity of other eigenfunctions, not independent of each other,
which it is not of interest to consider as distinct solutions. To
remove this indeterminacy, it is customary to add another
condition, which is called the normalization condition. This is

$$\int_a^b |y|^2\,dx = 1,$$

or else (if we designate, as we shall often do, the complex conjugate
by an asterisk)

$$\int_a^b yy^*\,dx = 1. \tag{15}$$

This condition can always be satisfied, since, given a $Y(x)$ which
is an eigenfunction not satisfying it, we have only to divide the
eigenfunction by the nonzero constant

$$N = \sqrt{\int_a^b YY^*\,dx}$$

(normalization factor), since the eigenfunction $y = Y/N$ obtained
in this manner satisfies (15).

It should be noted that (15) is also satisfied if we multiply $y$ by a
complex constant of modulus 1 (that is, by a factor of the form
$e^{i\delta}$, with $\delta$ real). The normalization therefore does
not completely determine the eigenfunction. Nevertheless, for most
eigenfunctions we ignore this arbitrariness, which has no influence at
all upon the modulus of the eigenfunction (see footnote on page 171).
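
The normalization procedure can be carried out numerically when the
integral is not known in closed form. The sketch below is illustrative
only: the sample eigenfunction $Y(x) = 3\sin\pi x$ on $(0, 1)$, the
Simpson quadrature, and all names are assumptions of the example.

```python
import cmath
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def normalization_factor(Y, a, b):
    """N = sqrt( integral of Y Y* dx ) over (a, b)."""
    return math.sqrt(simpson(lambda x: abs(Y(x)) ** 2, a, b))

def Y(x):
    return 3.0 * math.sin(math.pi * x)   # unnormalized sample eigenfunction

N = normalization_factor(Y, 0.0, 1.0)

def y(x):
    return Y(x) / N                      # normalized eigenfunction

phase = cmath.exp(0.7j)                  # arbitrary factor of modulus 1

def y_phase(x):
    return phase * y(x)

print(N, normalization_factor(y, 0.0, 1.0), normalization_factor(y_phase, 0.0, 1.0))
```

Both `y` and `y_phase` satisfy the normalization condition, which
illustrates that the condition leaves a factor of modulus 1
undetermined.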

5. Orthogonality of the eigenfunctions. We shall now show a
fundamental property of the eigenfunctions.

Let us consider two eigenfunctions $y_n$ and $y_m$ of (14), obeying
conditions ($\alpha$) and belonging to two distinct eigenvalues
$\lambda_n$ and $\lambda_m$. They will identically satisfy the two
relations

$$\frac{d}{dx}(Py_n') + (\lambda_n + q)y_n = 0, \tag{16}$$

$$\frac{d}{dx}(Py_m') + (\lambda_m + q)y_m = 0. \tag{16'}$$

The second relation will also be satisfied by $y_m^*$, the complex
conjugate of $y_m$ (the coefficients being real); that is, we obtain

$$\frac{d}{dx}(Py_m^{*\prime}) + (\lambda_m + q)y_m^* = 0.$$

Multiplying this through by $y_n$, and (16) by $y_m^*$, and
subtracting, we get

$$y_m^*\frac{d}{dx}(Py_n') - y_n\frac{d}{dx}(Py_m^{*\prime}) + (\lambda_n - \lambda_m)y_ny_m^* = 0.$$

We can immediately verify that the two first terms together are the
exact derivative of $P(y_m^*y_n' - y_ny_m^{*\prime})$, so that, by
multiplying the entire equation by $dx$ and integrating between $a$
and $b$, we have

$$\left[P(y_m^*y_n' - y_ny_m^{*\prime})\right]_a^b + (\lambda_n - \lambda_m)\int_a^b y_ny_m^*\,dx = 0.$$

Since at the limits both $y_n$ and $y_m^*$ vanish, the first part is
zero. As long as we suppose that $\lambda_n \neq \lambda_m$, we are
left with

$$\int_a^b y_ny_m^*\,dx = 0. \tag{17}$$

Such a relation, of the integral type, between the two
eigenfunctions $y_n$ and $y_m$ is called (for reasons which will be
explained in Chapter 10) an orthogonality relation. Thus we can say
that two eigenfunctions of (14) which vanish at the end points and
which belong to different eigenvalues are orthogonal.

The same property evidently still exists if the boundary
conditions are of type ($\beta$), since the coefficient $P$ takes on
the same values at $a$ and $b$.

We observe that (17) implies the conjugate relation

$$\int_a^b y_n^*y_m\,dx = 0. \tag{17'}$$

The orthogonality relations and normalization conditions can
be expressed by a single formula by introducing the often used and
very convenient symbol $\delta_{nm}$ with the meaning

$$\delta_{nm} = \begin{cases} 1 & \text{for } m = n, \\ 0 & \text{for } m \neq n. \end{cases}$$

Then (15) and (17) are combined in the formula

$$\int_a^b y_ny_m^*\,dx = \delta_{nm}. \tag{18}$$
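
The combined orthonormality formula can be checked numerically in a
familiar case. As a sketch under stated assumptions, take the equation
$y'' + \lambda y = 0$ on $(0, 1)$ with vanishing boundary values, whose
normalized eigenfunctions are $\sqrt{2}\sin n\pi x$; this example, the
Simpson quadrature, and the function names are not taken from the
text. The functions are real, so the complex conjugate plays no part
here.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def eigenfunction(n):
    """Normalized eigenfunction sqrt(2) sin(n pi x) on (0, 1)."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)

def gram_matrix(size):
    """Matrix of the integrals of y_n y_m over (0, 1), for n, m = 1..size."""
    return [
        [simpson(lambda x: eigenfunction(n)(x) * eigenfunction(m)(x), 0.0, 1.0)
         for m in range(1, size + 1)]
        for n in range(1, size + 1)
    ]

for row in gram_matrix(3):
    print([round(value, 6) for value in row])
```

Up to quadrature and rounding error the matrix is the unit matrix,
that is, its elements reproduce the Kronecker symbol.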
6. Degeneracy (multiple eigenvalues). As we have said, one or
two linearly independent eigenfunctions may correspond to a given
eigenvalue. If only one corresponds to it, all other solutions
belonging to this eigenvalue are obtainable from this solution by
multiplying it by a constant. If, however, the normalization
condition is added, we can say that to the eigenvalue there corresponds
a single normalized eigenfunction (apart from the constant of
modulus 1, mentioned in §4). In this case the eigenvalue is said
to be simple, since it is a simple root of equation
$\Delta(\lambda) = 0$.

Let us now consider the case* in which two linearly independent
eigenfunctions correspond to one eigenvalue. Here the eigenvalue
is called double; and it will be said, in accordance with an expression
that has become customary in atomic physics, that there is
degeneracy. Let these two normalized eigenfunctions be $Y_1(x)$ and
$Y_2(x)$ (we must remember that they will not in general be
orthogonal, since the argument in §5 applies only to eigenfunctions
belonging to two distinct eigenvalues). It is clear that from these
eigenfunctions we may obtain an infinite number of other
eigenfunctions corresponding to the same eigenvalue, by forming a linear
combination with two arbitrary coefficients. We shall now prove
that it is always possible, in an infinite number of ways, to construct
two of these eigenfunctions which are normalized and orthogonal.

In fact, a first pair of orthogonal eigenfunctions can be built up
of Y_1 itself and an appropriate linear combination Ȳ_2 = αY_1 + βY_2.
It suffices to choose the coefficients α and β such that

    \int_a^b Y_1^* \bar{Y}_2 \, dx = 0,

that is,

    \alpha + \beta \int_a^b Y_1^* Y_2 \, dx = 0,

which fixes the ratio of α to β. One can then dispose of one of these
to get Ȳ_2 to be normalized also. From the pair Y_1, Ȳ_2 there may
then be obtained an infinite number of others by the formulas

    y_1 = c_{11} Y_1 + c_{12} \bar{Y}_2,
    y_2 = c_{21} Y_1 + c_{22} \bar{Y}_2;

and we find immediately that in order for y_1 and y_2 to be orthogonal
and normalized, the coefficients must be subject to the restrictions

* It is immediately apparent that this case may occur with boundary
conditions (β) but not with (α).



    c_{11} c_{11}^* + c_{12} c_{12}^* = 1,
    c_{21} c_{21}^* + c_{22} c_{22}^* = 1,
    c_{11} c_{21}^* + c_{12} c_{22}^* = 0.    (19)

In the case of real coefficients, these are simply the characteristic
relations that exist, in analytic geometry, between the coefficients
of the formulas for the rotation of the coordinate axes. These
coefficients are known to be

    c_{11} = \cos\theta,     c_{12} = \sin\theta,
    c_{21} = -\sin\theta,    c_{22} = \cos\theta.    (20)

For this reason, a linear transformation of coefficients satisfying
equations (19) is called an orthogonal transformation.

We may therefore say that to a double eigenvalue there corre¬ 
spond an infinite number of normalized pairs of eigenfunctions 
orthogonal to one another (besides being orthogonal to the other 
eigenfunctions), which pairs are obtained from one another by 
means of a general orthogonal transformation. 
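
The construction of this section can be tried out numerically. The sketch below is only an illustration under stated assumptions: the two functions Y1 and Y2 are arbitrary normalized, non-orthogonal real functions on (0, 1) chosen for convenience (they are not eigenfunctions of (14)), the integrals are approximated by a midpoint rule, and all names are hypothetical. It forms Ȳ_2 = αY_1 + βY_2 as described above, applies the rotation (20) with an arbitrary angle, and evaluates the four integrals that should reproduce δ_nm.

```python
import math

def inner(f, g, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of the integral of f(x) * g(x) over (a, b)."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n)) * h

# Illustrative stand-ins for Y_1 and Y_2: normalized on (0, 1), not orthogonal.
def Y1(x):
    return 1.0

def Y2(x):
    return math.sqrt(3.0) * x

# Choose alpha and beta so that alpha + beta * <Y1, Y2> = 0 and Y2bar is normalized.
s = inner(Y1, Y2)
beta = 1.0 / math.sqrt(1.0 - s * s)
alpha = -beta * s

def Y2bar(x):
    return alpha * Y1(x) + beta * Y2(x)

def rotated_pair(theta):
    """Apply the real orthogonal transformation (20) to the pair (Y1, Y2bar)."""
    c, sn = math.cos(theta), math.sin(theta)
    y1 = lambda x: c * Y1(x) + sn * Y2bar(x)
    y2 = lambda x: -sn * Y1(x) + c * Y2bar(x)
    return y1, y2

y1, y2 = rotated_pair(0.7)  # any angle gives another orthonormal pair
gram = [[inner(u, v) for v in (y1, y2)] for u in (y1, y2)]
print(gram)  # close to the unit matrix
```

Changing the angle passed to rotated_pair leaves the matrix of integrals unchanged to within rounding, which is the statement that the orthonormal pairs are related by orthogonal transformations.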

It is often convenient to consider a double eigenvalue [double
root of the equation Δ(λ) = 0] as two coinciding eigenvalues λ_1
and λ_2, and to make an eigenfunction that is normalized and
orthogonal to the other correspond to each of them in the manner
explained above. With this convention we can say that to every
eigenvalue there corresponds an eigenfunction, all such
eigenfunctions being orthogonal to each other.

7. Nodes of the eigenfunctions. In many questions it is of 
interest to examine those values of x for which the eigenfunction 
vanishes (nodes of the eigenfunction). We shall therefore state 
(without proof) a theorem concerning this matter (see Nos. 25 or 34 
of the Bibliography). 

Let us consider any solution of equation (14) which vanishes
at A, and let it be graphically represented by one of the curves of
Fig. 17. We have already mentioned (§1) that the position of the
nodes depends only on the coefficients of the equation. If we now
vary the parameter λ, which is contained in them, the nodes will be
displaced, and it can be shown that when λ is being increased
continuously, the nodes are continuously displaced toward the left. Every
time one of these nodes coincides with the point B, the curve will
satisfy the conditions (α) and hence will represent an eigenfunction,
and the corresponding value of λ will be an eigenvalue, relative to
conditions (α).

From this statement we can see that every time λ, upon increasing,
passes through one of the eigenvalues, a new node enters the
interval AB. If the (supposedly simple) eigenvalues are indicated
by λ_1, λ_2, . . . in increasing order, there will be no nodes in the
interval AB as long as λ < λ_1; for λ_1 < λ < λ_2 there is one node,
for λ_2 < λ < λ_3 there are two nodes, and so on. In particular, the
eigenfunction y_n has (n − 1) nodes in the interval AB, not counting
the two at the end points; or, more clearly, the nodes of the
eigenfunction y_n divide the interval AB into n parts.
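
The node count can be checked numerically in the simplest special case. The sketch below assumes the equation y'' + λy = 0 (the example treated in §8) on the interval from A = −1 to B = 1, for which the solution vanishing at A is sin[√λ (x − A)] and the eigenvalues for conditions (α) are λ_n = (nπ/2)². The function name, the sampling density, and the choice of test values of λ are illustrative assumptions.

```python
import math

def count_interior_nodes(lam, A=-1.0, B=1.0, samples=20001):
    """Count sign changes of y(x) = sin(sqrt(lam) * (x - A)) strictly inside (A, B)."""
    k = math.sqrt(lam)
    nodes = 0
    last_sign = 0
    for i in range(1, samples):
        x = A + (B - A) * i / samples
        y = math.sin(k * (x - A))
        sign = (y > 0) - (y < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                nodes += 1
            last_sign = sign
    return nodes

def lam_between(n):
    """A value of lambda lying between the eigenvalues lambda_n and lambda_(n+1)."""
    return ((n + 0.5) * math.pi / 2.0) ** 2

counts = [count_interior_nodes(lam_between(n)) for n in range(5)]
print(counts)
```

For λ chosen between λ_n and λ_{n+1} the count equals n, in agreement with the statement that a new node enters the interval each time λ passes through an eigenvalue.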

8. The equation of harmonic motion. We shall illustrate what
we have said by means of the following equation, well known in
mechanics and called the "equation of harmonic motion," with which
we shall be concerned later on:

    y'' + \lambda y = 0.    (21)

This equation evidently belongs to the type (14). In this case
the procedure of §1 is actually applicable, since the general solution
of (21) is known:

    y = c_1 e^{i\sqrt{\lambda}\,x} + c_2 e^{-i\sqrt{\lambda}\,x}.    (22)

The interval over which the integration is of interest is to be
(−l, l). Let us consider the two types of boundary conditions (α)
and (β) separately.

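Conditions (α): We must have*

    y(-l) = y(l) = 0,

that is,

    c_1 e^{-i\sqrt{\lambda}\,l} + c_2 e^{i\sqrt{\lambda}\,l} = 0,
    c_1 e^{i\sqrt{\lambda}\,l} + c_2 e^{-i\sqrt{\lambda}\,l} = 0.    (23)

In order that this system of linear homogeneous equations in
c_1 and c_2 may have nonzero solutions, we must have

    \begin{vmatrix} e^{-i\sqrt{\lambda}\,l} & e^{i\sqrt{\lambda}\,l} \\ e^{i\sqrt{\lambda}\,l} & e^{-i\sqrt{\lambda}\,l} \end{vmatrix} = 0.

Upon expanding, this equation leads to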

* These conditions, for instance, come up in the problem of standing waves 
in a string with fixed endpoints. Another example will be seen in §38. 






    \sin 2\sqrt{\lambda}\,l = 0,

or else

    2\sqrt{\lambda}\,l = n\pi    (n = 0, 1, 2, \ldots).    (24)

Therefore λ must have one of the values

    \lambda_n = \frac{n^2 \pi^2}{4 l^2},    (24')

which are the desired eigenvalues.

λ having been determined, we get from either equation (23)

    c_1 = C_n e^{i\sqrt{\lambda_n}\,l},    c_2 = -C_n e^{-i\sqrt{\lambda_n}\,l},    (23')

where C_n is an arbitrary constant. Substituting into (22), we get

    y_n = C_n \, 2i \sin \sqrt{\lambda_n}\,(x + l) = C_n \sin \frac{n\pi}{2l}(x + l),

where the factor 2i has been incorporated into the arbitrary
constant C_n. Except for the irrelevant case in which n = 0 (that is,
in which y vanishes identically), the normalization condition then
yields

    \int_{-l}^{l} y_n y_n^* \, dx = l \, C_n C_n^* = 1,

whence

    |C_n| = \frac{1}{\sqrt{l}}.

If we require y_n to be real, we must take for the eigenfunctions

    y_n = \frac{1}{\sqrt{l}} \sin \frac{n\pi}{2l}(x + l).    (25)

We can then immediately verify the orthogonality theorem, since
for n ≠ m we have

    \frac{1}{l} \int_{-l}^{l} \sin \frac{n\pi}{2l}(x + l) \, \sin \frac{m\pi}{2l}(x + l) \, dx = 0.

The existence of discrete eigenvalues in this problem may be
explained intuitively in the following manner. The solutions of
(21) are graphically represented by exponential curves if λ < 0,
and by sinusoidal curves if λ > 0. The first have at most one node;
the latter have an infinite number of nodes spaced a distance
π/√λ apart. If we require that two nodes are to fall at the two
end points, we must discard the exponentials and equate the
interval 2l to a multiple of the distance between two nodes; this
yields (24). The nth eigenfunction (25) is represented by a sine
curve that possesses, in addition to the two nodes at A and
B, (n − 1) intermediate nodes (see §7). The normalization
condition then determines the amplitude of the sine function.
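
The eigenfunctions (25) and eigenvalues (24') lend themselves to a quick numerical check. The following is a minimal sketch, assuming an arbitrary half-length l = 1.5 and a midpoint-rule quadrature; the function names and the step count are illustrative choices.

```python
import math

L_HALF = 1.5  # half-length l of the interval (-l, l); arbitrary illustrative value

def eigenvalue(n, l=L_HALF):
    """Eigenvalue (24') for boundary conditions (alpha)."""
    return (n * math.pi / (2.0 * l)) ** 2

def y(n, x, l=L_HALF):
    """Real normalized eigenfunction (25)."""
    return math.sin(n * math.pi * (x + l) / (2.0 * l)) / math.sqrt(l)

def overlap(n, m, l=L_HALF, steps=4000):
    """Midpoint-rule approximation of the integral of y_n * y_m over (-l, l)."""
    h = 2.0 * l / steps
    total = 0.0
    for i in range(steps):
        x = -l + (i + 0.5) * h
        total += y(n, x, l) * y(m, x, l)
    return total * h

def residual(n, x, l=L_HALF, dx=1e-4):
    """Finite-difference estimate of y'' + lambda_n * y, which should be close to zero."""
    second = (y(n, x + dx, l) - 2.0 * y(n, x, l) + y(n, x - dx, l)) / (dx * dx)
    return second + eigenvalue(n, l) * y(n, x, l)

print(overlap(2, 2), overlap(1, 3), overlap(2, 3))
print(y(3, -L_HALF), y(3, L_HALF), residual(3, 0.4))
```

The diagonal integral comes out close to 1 and the off-diagonal ones close to 0, while the end-point values and the finite-difference residual of (21) are small.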

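Conditions (β): We must then have

    y(-l) = y(l),
    y'(-l) = y'(l),

that is,

    c_1 e^{-i\sqrt{\lambda}\,l} + c_2 e^{i\sqrt{\lambda}\,l} = c_1 e^{i\sqrt{\lambda}\,l} + c_2 e^{-i\sqrt{\lambda}\,l},
    c_1 e^{-i\sqrt{\lambda}\,l} - c_2 e^{i\sqrt{\lambda}\,l} = c_1 e^{i\sqrt{\lambda}\,l} - c_2 e^{-i\sqrt{\lambda}\,l}.    (26)

These conditions may also be written

    (c_1 - c_2) \sin \sqrt{\lambda}\,l = 0,
    (c_1 + c_2) \sin \sqrt{\lambda}\,l = 0.    (27)

Hence, if c_1 and c_2 are not both zero,

    \sin \sqrt{\lambda}\,l = 0,

from which

    \sqrt{\lambda}\,l = n\pi    (n = 0, 1, 2, \ldots).    (28)

The eigenvalues are therefore

    \lambda_n = n^2 \frac{\pi^2}{l^2}.    (28')
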

If λ is chosen in this way, both boundary conditions will be
satisfied, and hence c_1 and c_2 remain arbitrary. The condition of
normalization yields only

    c_1 c_1^* + c_2 c_2^* = \frac{1}{2l}.

We therefore have a case of degeneracy; the eigenvalues are
double (except for the eigenvalue 0).

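We may obtain a pair of independent, orthogonal, and normalized
solutions (for n ≠ 0) by taking in (22)

    c_1 = \frac{1}{\sqrt{2l}}, \; c_2 = 0    or    c_1 = 0, \; c_2 = \frac{1}{\sqrt{2l}},

respectively, and then

    y_n^{(1)} = \frac{1}{\sqrt{2l}} e^{i n (\pi/l) x},    y_n^{(2)} = \frac{1}{\sqrt{2l}} e^{-i n (\pi/l) x}.    (29)

Another solution is obtained by taking

    c_1 = \frac{1}{2\sqrt{l}}, \; c_2 = \frac{1}{2\sqrt{l}}    or    c_1 = \frac{1}{2i\sqrt{l}}, \; c_2 = -\frac{1}{2i\sqrt{l}},

respectively, and then

    y_n^{(1)} = \frac{1}{\sqrt{l}} \cos \frac{n\pi x}{l},    y_n^{(2)} = \frac{1}{\sqrt{l}} \sin \frac{n\pi x}{l}.    (30)
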

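(For the eigenvalue 0 we have instead the unique eigenfunction
y_0 = 1/√(2l).) From any one of these pairs we can get an infinite
number of others by orthogonal transformation. For example,
from (30), by the transformation (20), we obtain the pair

    y_n^{(1)} = \frac{1}{\sqrt{l}} \cos\left( \frac{n\pi x}{l} - \theta \right),    y_n^{(2)} = \frac{1}{\sqrt{l}} \sin\left( \frac{n\pi x}{l} - \theta \right),

with θ arbitrary.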

We might have foreseen these results intuitively by observing
that conditions (β) force the sine curve to have the same ordinate
and the same slope at A and B [and hence the interval AB must be a
multiple of the wavelength 2π/√λ, whence (28)], but its phase θ
at A remains arbitrary.

9. Expansion of a function in a series of eigenfunctions. Fourier
series.* Let us consider the infinite sequence of orthonormal
(orthogonal and normalized) eigenfunctions y_1, y_2, . . . , corresponding
to the eigenvalues of equation (14), with boundary conditions (α)
or (β) and with the convention (explained in §6) that any double
eigenvalues are to be counted twice. We can prove that, given
any function f(x) satisfying the same boundary conditions and also
certain qualitative conditions which are not very restrictive, it is
possible to expand it over the interval (a, b) in a series of the form

    f(x) = \sum_{n=1}^{\infty} f_n y_n(x),    (31)

which is uniformly convergent in the interval (a, b).

* See No. 32a of the Bibliography, §§74-77.

Assuming this theorem,† we recognize immediately that the
coefficients f_n are given by

    f_n = \int_a^b f(x) \, y_n^*(x) \, dx;    (32)

indeed, upon multiplying both sides of (31) by y_r^*(x) dx (where r is
any one of the indices 1, 2, . . .) and integrating between a and b,
all terms in the right-hand member will vanish by virtue of the
orthogonality of the y_n, except the one for which n = r, which
reduces to f_r. Upon changing the index from r to n, we obtain (32).

This procedure is recognizable as merely a generalization of the
well-known procedure used to determine the coefficients of the
Fourier expansion; the latter is just a particular case of expansion
in a series of orthogonal functions, namely, of eigenfunctions of (21)
with boundary conditions (β). In fact, taking these eigenfunctions
in the form (30) and incorporating the constant 1/√l into the
coefficients f_n (which we shall call a_n and b_n), we can write the
development (31) as

    f(x) = \sum_{n=0}^{\infty} \left( a_n \cos \frac{n\pi x}{l} + b_n \sin \frac{n\pi x}{l} \right),    (33)

where, according to (32),

    a_0 = \frac{1}{2l} \int_{-l}^{l} f(x) \, dx,
    a_n = \frac{1}{l} \int_{-l}^{l} f(x) \cos \frac{n\pi x}{l} \, dx    (n \ge 1),
    b_n = \frac{1}{l} \int_{-l}^{l} f(x) \sin \frac{n\pi x}{l} \, dx.    (34)

† See No. 25 or 34 of the Bibliography. More generally the following
theorem holds. If the function f is such that \int_a^b |f|^2 dx exists,
the series (31) is convergent at least in the mean, so that

    \lim_{N \to \infty} \int_a^b \left| f - \sum_{n=1}^{N} f_n y_n \right|^2 dx = 0.

Expanding the square occurring in this equation and using (32) and the
orthogonality property, we find the important formula of Parseval:

    \int_a^b |f|^2 \, dx = \lim_{N \to \infty} \sum_{n=1}^{N} |f_n|^2,    (31**)

which expresses the completeness of the system of eigenfunctions y_n. In fact,
considering any system of orthogonal functions, even infinite, but not
representing the totality of eigenfunctions of a differential equation, the sign >
rather than = will hold in (31**).


This is the Fourier expansion. It represents the function f(x) over
the interval (−l, l) even if the function has points of discontinuity
there (at which the series represents the arithmetic mean of the two
limits from the right and from the left). In order for the expansion
to hold, it is sufficient that the interval be divisible into segments
over each of which f is continuous and monotonic.
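
The behavior at a point of discontinuity can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the test function is a unit step on (−l, l) with l = 1, the coefficients (34) are approximated by a midpoint rule, and the series is truncated at n = 100. The function names and these numerical choices are illustrative only.

```python
import math

def fourier_coefficients(f, l, n_max, steps=2000):
    """Midpoint-rule approximation of the coefficients a_n, b_n of (34)."""
    h = 2.0 * l / steps
    xs = [-l + (i + 0.5) * h for i in range(steps)]
    fs = [f(x) for x in xs]
    a = [sum(fs) * h / (2.0 * l)]   # a_0
    b = [0.0]                       # b_0 multiplies sin(0) and plays no role
    for n in range(1, n_max + 1):
        w = n * math.pi / l
        a.append(sum(fx * math.cos(w * x) for fx, x in zip(fs, xs)) * h / l)
        b.append(sum(fx * math.sin(w * x) for fx, x in zip(fs, xs)) * h / l)
    return a, b

def partial_sum(a, b, l, x):
    """Truncated Fourier series evaluated at x."""
    return sum(a[n] * math.cos(n * math.pi * x / l) + b[n] * math.sin(n * math.pi * x / l)
               for n in range(len(a)))

def step(x):
    """Test function: 0 to the left of the origin, 1 to the right."""
    return 1.0 if x > 0 else 0.0

l = 1.0
a, b = fourier_coefficients(step, l, n_max=100)
at_jump = partial_sum(a, b, l, 0.0)   # value of the series at the discontinuity
inside = partial_sum(a, b, l, 0.5)    # value at a point where the function equals 1
print(at_jump, inside)
```

At the jump the truncated series gives the arithmetic mean of the two one-sided limits, while at an interior point of continuity it approaches the value of the function as more terms are kept.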

The same expansion may be obtained in a more convenient form
by using the eigenfunctions (29), which can be collected into the
single formula

    y_n = \frac{1}{\sqrt{2l}} e^{i (n\pi x / l)},    (35)

provided that we make the convention that n may also take on
negative values. Then, when we set f_n/√(2l) = c_n, (31) yields
(Fourier expansion in exponential form)

    f(x) = \sum_{n=-\infty}^{\infty} c_n e^{i (n\pi x / l)},    (36)

and (32) yields the expression for the coefficients

    c_n = \frac{1}{2l} \int_{-l}^{l} f(x) \, e^{-i (n\pi x / l)} \, dx.    (37)

If we calculate, by means of (36), the integral of ff* extended
over the interval (−l, l), we readily find that

    \int_{-l}^{l} |f|^2 \, dx = 2l \sum_{n=-\infty}^{\infty} |c_n|^2,    (38)

which is simply the completeness relation for the eigenfunctions in
question.

10. Case of an infinite interval. Continuous spectrum of
eigenvalues. In certain problems of wave mechanics, equation (14)
must be integrated, not over a finite interval (a, b) as we have
assumed thus far, but over an interval infinite on one or on both
sides: (a, ∞) or (−∞, +∞). It will be convenient to examine this
case as a limit of that in which integration is over (a, b) by making
one or both endpoints tend toward infinity. It will then be
recognized that the problem exhibits new characteristics in the case of the
infinite interval, of which the most important is the appearance of a
continuous spectrum of eigenvalues. With a continuous spectrum
there may exist for λ (in addition to a possible discrete sequence
of eigenvalues) some intervals within which any value of λ is an
eigenvalue; and, of course, to these eigenvalues there corresponds a
continuous set of eigenfunctions, which for convenience we shall
designate by y_λ(x) [where the index λ varies continuously, so that,
actually, y is a function of two variables and may also be indicated
by y(λ, x)].

We shall only consider the case in which the interval is (−∞,
+∞), and we shall confine ourselves to some considerations of an
intuitive nature by using the example of §8, case (β). If we start
with one of the eigenvalues (28') and the corresponding
eigenfunctions (29), and let l tend toward ∞, we shall see that λ_n approaches
the eigenvalue 0 and that the two eigenfunctions also tend toward
zero for any x; therefore we obtain nothing of interest. Instead, it
is necessary, upon letting l tend toward ∞, to consider
eigenfunctions of higher and higher order; in this process n will also approach
infinity, in such a manner* that the eigenvalue λ_n = n²(π²/l²)
approaches a predetermined limit λ (> 0). The two eigenfunctions
then approach

    y_\lambda^{(1)} = a_\lambda e^{i\sqrt{\lambda}\,x},    y_\lambda^{(2)} = a_\lambda e^{-i\sqrt{\lambda}\,x},    (39)

where a_λ is a constant so far as x is concerned. In fact, if the two
expressions are combined into one by setting ω = +√λ for the
first and ω = −√λ for the second,

    y_\omega = a_\omega e^{i\omega x}.    (39')

We may consider this formula analogous to (35). The continuous
variable ω, which may range from −∞ to +∞, takes the place of
the discontinuous variable nπ/l.

* In order to understand this process intuitively, let us imagine that the
eigenvalues (28') are represented by points on a straight line (Fig. 18). It is
clear from the formula that when l is increased, these points are displaced
indefinitely toward the origin, becoming more and more closely spaced.
Through a fixed point on the line, there successively pass eigenvalues of
higher and higher order; that is, a given value of λ is to be identified
with eigenvalues of higher and higher order. Therein lies the significance
of letting n approach ∞ at the same time as l.

[Fig. 18: the eigenvalues λ_2, λ_3, . . . marked as points on a straight line starting from the origin O.]

The extension of the orthogonality theorem and of normalization
to these eigenfunctions "with continuous spectrum" requires some
attention; in fact, it is obvious from (39') that y_ω does not approach
zero for x → ±∞, and hence that the normalization integral does
not converge.

Instead, it is necessary to consider, not an eigenfunction belonging
to a definite value of λ (we refer for the present only to y^{(1)} or
y^{(2)}), but the sum of all eigenfunctions belonging to an infinitesimal
interval (λ_0, λ_0 + Δλ), or the integral

    \Delta y = \int_{\lambda_0}^{\lambda_0 + \Delta\lambda} y_\lambda(x) \, d\lambda,    (40)

which is an infinitesimal of order Δλ, and is called eigendifferential
(by the notation Δy, we imply a passage to the limit for Δλ → 0).
Setting λ = λ_0 + Δλ/2 + ε and neglecting the squares of ε and of
Δλ, we easily find (writing λ instead of λ_0)*

    \Delta y = a_\lambda e^{i\sqrt{\lambda}\,x} \, \frac{4\sqrt{\lambda}}{x} \sin \frac{x \, \Delta\lambda}{4\sqrt{\lambda}},

or else, setting √λ = ω as above,



an expression which also extends to negative values of ω.

The function Δy/Δω can be seen to approach 0 when x → ±∞,
a statement that can be interpreted by saying that the infinite
number of superimposed sine functions cancel at infinity by mutual
interference. For finite x, however, Δy/Δω approaches y_ω as Δω
approaches zero.

* Since the integral (40) is extended over an infinitesimal interval Δλ, we
can consider that it will reduce to a single element and may be obtained by
writing Δλ instead of dλ in the integrand. This result, however, holds only
for finite values of x, as can be seen from expression (41), since even an
infinitesimal variation of λ makes a difference in the value of the function, which becomes
larger as x increases.

In addition, the functions Δy possess a property that is the natural
extension of the orthogonality of two eigenfunctions belonging to
different eigenvalues. In fact, if Δ_1 y and Δ_2 y are the Δy's belonging
to two small intervals of eigenvalues (λ_1, λ_1 + Δ_1λ) and (λ_2, λ_2 + Δ_2λ)
that do not even partially overlap, we have, as may be verified by a
somewhat laborious integration,

    \int_{-\infty}^{\infty} \Delta_1 y \, \Delta_2 y^* \, dx = 0.    (42)

If instead the two small intervals have a segment Δλ in common,
the integral proves to be

    4\pi \, a_\lambda a_\lambda^* \sqrt{\lambda} \, \Delta\lambda,

where λ_1 and λ_2 are indifferently designated by λ, since they differ
by an infinitesimal. This result suggests setting up as normalization
condition

    \frac{1}{\Delta\lambda} \int_{-\infty}^{\infty} \Delta_1 y \, \Delta_2 y^* \, dx = 1.    (43)

Taking into account the value of the integral shown above, we see
that the normalization constant a_λ must then be chosen so that

    a_\lambda a_\lambda^* = \frac{1}{4\pi \sqrt{\lambda}}.

We can then take as the normalized eigenfunction of (21) in the
interval (−∞, +∞)

    y_\lambda = \frac{e^{\pm i\sqrt{\lambda}\,x}}{2\sqrt{\pi}\,\sqrt[4]{\lambda}},    (44)

or else

    y_\omega = \frac{e^{i\omega x}}{2\sqrt{\pi |\omega|}}.    (44')


Generalizing what we have shown for this particular example, we
can say that equation (14), for an infinite interval, can possess a
continuous spectrum of eigenvalues. The eigenfunctions belonging
to these eigenvalues do not, in general, approach zero at infinity, but
their eigendifferentials Δy do. The boundary condition is precisely†

    \lim_{x \to \pm\infty} \frac{1}{\Delta\lambda} \int_{\lambda}^{\lambda + \Delta\lambda} y_\lambda(x) \, d\lambda = 0    (45)

and the orthogonality property is

    \int_{-\infty}^{\infty} \Delta_1 y \, \Delta_2 y^* \, dx = 0    (46)

for two intervals Δ_1λ and Δ_2λ that do not overlap. The normalization
condition is

    \frac{1}{\Delta\lambda} \int_{-\infty}^{\infty} dx \left[ \int_{\lambda_1}^{\lambda_1 + \Delta_1\lambda} y_\lambda(x) \, d\lambda \cdot \int_{\lambda_2}^{\lambda_2 + \Delta_2\lambda} y_\lambda^*(x) \, d\lambda \right] = 1    (46')

for two small intervals which have a segment Δλ in common, or
which possibly coincide.

† The limit x → ±∞ refers to the case where the interval is infinite on both
ends; otherwise we are to read only +∞, or −∞, and to substitute the ordinary
condition y = 0 for the other endpoint. The integrations with respect to x are
understood to be extended over the entire interval.

If, besides the continuous eigenvalues, there are also discrete
eigenvalues λ_n, the following orthogonality property holds between
the eigenfunctions of the continuous spectrum and those of the
discrete spectrum:

    \int_{-\infty}^{\infty} y_n(x) \, dx \int_{\lambda}^{\lambda + \Delta\lambda} y_\lambda^*(x) \, d\lambda = 0.    (47)

We can now extend the expansion of an arbitrary function in a
series of eigenfunctions to the case of an infinite interval. However,
in correspondence to continuous eigenvalues, we shall have an
integral rather than a series. Referring, for greater generality, to
the case in which both discrete and continuous eigenvalues exist, we
have an expansion of the form*

    f(x) = \sum_{n=1}^{\infty} f_n y_n(x) + \int f_\lambda \, y_\lambda(x) \, d\lambda,    (48)

where f_λ is a function of the continuous variable λ, and the
integration with respect to λ is understood to be carried out over the whole
continuous spectrum of eigenvalues. The determination of the
coefficients f_n is made as in §9, with theorem (47) used in addition to
the orthogonality of the y_n, and one finds the same expression (32).
The function f_λ (which appears under the integral sign in a manner
analogous to the f_n under the summation sign) is similarly
determined in the following manner. Let us first of all decompose the
continuous spectrum into intervals Δ_1λ, Δ_2λ, . . . and let us break
up the integral as follows:

    f(x) = \sum_{n=1}^{\infty} f_n y_n(x) + \sum_{s} \int_{\lambda_s}^{\lambda_s + \Delta_s\lambda} f_\lambda \, y_\lambda(x) \, d\lambda.

We now multiply both sides of the equation by

    \frac{1}{\Delta_r\lambda} \int_{\lambda_r}^{\lambda_r + \Delta_r\lambda} y_\lambda^*(x) \, d\lambda

and integrate with respect to x (over the whole range of x). Then,
letting the intervals Δ_rλ approach zero, we see that by virtue of
(47), all terms of the first sum on the right-hand side vanish, and
according to (46) all terms of the second sum vanish except the rth
term, which reduces to f_λ, according to (46'). Omitting the index r,
since λ varies continuously, we have

    f_\lambda = \lim_{\Delta\lambda \to 0} \frac{1}{\Delta\lambda} \int_{-\infty}^{\infty} f(x) \, dx \int_{\lambda}^{\lambda + \Delta\lambda} y_\lambda^*(x) \, d\lambda,    (49)

and if f(x) becomes infinitesimal of sufficiently high order at infinity,
we can interchange the limit with the integral sign and obtain

    f_\lambda = \int_{-\infty}^{\infty} f(x) \, y_\lambda^*(x) \, dx.    (50)

The completeness relation (in the case of the continuous spectrum)
is

    \int_{-\infty}^{\infty} |f|^2 \, dx = \int |f_\lambda|^2 \, d\lambda.    (51)

* Extending the concept of integral in the sense of Stieltjes, we could write
formula (48), like all analogous ones, with a single integral, making it also
contain possible discrete terms (see, for example, page 123 of No. 14 of the
Bibliography).

The completeness relation (in the case of the continuous spectrum) 
is 

/_“J/I'rfx= /|/xl=“dX. (51) 

11. Expansion in a Fourier integral. The most important
application of the preceding considerations is the expansion in a Fourier
integral, which is the natural extension of the Fourier series
expansion to the case where it is desired to represent a given (nonperiodic)
function f(x) over an infinite interval. This expansion can be
obtained by taking expressions (44) as eigenfunctions, and hence

    f(x) = \int_0^{\infty} C_1(\lambda) \frac{e^{i\sqrt{\lambda}\,x}}{2\sqrt{\pi}\,\sqrt[4]{\lambda}} \, d\lambda + \int_0^{\infty} C_2(\lambda) \frac{e^{-i\sqrt{\lambda}\,x}}{2\sqrt{\pi}\,\sqrt[4]{\lambda}} \, d\lambda,    (52)

or else

    f(x) = \int_{-\infty}^{\infty} C(\omega) \, e^{i\omega x} \, d\omega,    (53)

having set

    C(\omega) = \sqrt{\omega/\pi} \; C_1(\omega^2)     (for ω > 0),
    C(\omega) = \sqrt{-\omega/\pi} \; C_2(\omega^2)    (for ω < 0).

The coefficients C_1 and C_2 are given by (50), and we find

    C(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(x) \, e^{-i\omega x} \, dx.    (54)

We observe that if f(x) is an even function, that is, if f(−x) =
f(x), then C(ω) will also be even, and if f(x) is odd, C(ω) will be odd.
In the first case, (53) can also be written as

    f(x) = 2 \int_0^{\infty} C(\omega) \cos \omega x \, d\omega,    (53')

with

    C(\omega) = \frac{1}{\pi} \int_0^{\infty} f(x) \cos \omega x \, dx;    (54')

and in the second case:

    f(x) = 2i \int_0^{\infty} C(\omega) \sin \omega x \, d\omega,    (53'')

    C(\omega) = \frac{1}{i\pi} \int_0^{\infty} f(x) \sin \omega x \, dx.    (54'')

Even in the general case, (53) and (54) can easily be put into
trigonometric form.
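
Formulas (53') and (54') can be tested on an even function whose transform is known in closed form. The sketch below assumes the Gaussian f(x) = exp(−x²/2), for which (54') gives C(ω) = exp(−ω²/2)/√(2π). The integrals are truncated at a finite upper limit and approximated by a midpoint rule; the function names, limits, and step counts are illustrative assumptions.

```python
import math

def f(x):
    """Even test function (an assumption chosen for this illustration)."""
    return math.exp(-0.5 * x * x)

def C(omega, x_max=12.0, steps=400):
    """(54'): (1/pi) * integral from 0 to infinity of f(x) cos(omega x) dx, truncated at x_max."""
    h = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += f(x) * math.cos(omega * x)
    return total * h / math.pi

def f_reconstructed(x, omega_max=12.0, steps=200):
    """(53'): 2 * integral from 0 to infinity of C(omega) cos(omega x) d omega, truncated."""
    h = omega_max / steps
    total = 0.0
    for j in range(steps):
        omega = (j + 0.5) * h
        total += C(omega) * math.cos(omega * x)
    return 2.0 * total * h

exact_C = math.exp(-0.5) / math.sqrt(2.0 * math.pi)  # closed form of C(1)
print(C(1.0), exact_C)
print(f_reconstructed(0.7), f(0.7))
```

The numerical coefficient agrees with the closed form, and inserting the coefficients back into (53') returns the original function to within the accuracy of the quadrature.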

12. Wave interpretation of the Fourier expansion. The
expansion in a Fourier integral is susceptible of an expressive physical
interpretation which may apply to wave phenomena of any kind,
including elastic waves, light waves, sound waves, and others.
We shall investigate this interpretation by using the terminology of
electromagnetic optics, but the translation of the same
considerations into the language corresponding to other types of wave
phenomena is immediate.

If monochromatic radiation of wavelength λ travels (in plane
waves) with a velocity V along the x-axis, any one of the components
f of the electric or magnetic field is represented, as a function of t
and of x, by the well-known wave equation

    f(x, t) = A \, e^{\frac{2\pi i}{\lambda}(x - Vt)} = A \, e^{2\pi i (kx - \nu t)},    (55)

where we set

    k = \frac{1}{\lambda}    ("wave number")

and

    \nu = \frac{V}{\lambda}    ("frequency")

and where A is a generally complex constant whose absolute value
represents the amplitude of the waves, whereas its argument
represents their phase. We shall call A the complex amplitude [as usual
the physical quantity f may be understood to be represented by the
real part, or by the imaginary part, of expression (55)].

In any case, f is represented graphically by a sinusoidal curve of
wavelength λ along the x-axis, which progresses in time in the
positive direction of the x-axis (if V > 0) with the uniform velocity V.
We shall say that f represents a monochromatic wave train.* We
then note that a reverse wave train may be represented by the same
expression, when k is given a negative value.

Heterochromatic radiation will be represented by a sum of
terms of the type (55) (with different values of λ and hence different
values of k and ν), each corresponding to a single monochromatic
component or to a single spectral line. In general, the velocity V
will also be different for the various radiation components; that is,
there will be dispersion. We can select any one of these quantities
λ, k, and ν to characterize the individual monochromatic
components. We shall take the "wave number k" and shall suppose
that the function V(k), governing the dispersion of the medium, is
known. The frequency will then be given by

    \nu(k) = \frac{V}{\lambda} = k V(k).    (56)

* By the word "train" we designate a sequence of waves unlimited in space
(from −∞ to +∞) and hence in time. Instead, we shall use the word "group"
when dealing with a limited number of waves.

If the radiation, when observed with a spectroscope, produces a
continuous spectrum rather than a line spectrum, we must represent
it by an integral rather than a sum, which will be

    f(x, t) = \int_{-\infty}^{\infty} A(k) \, e^{2\pi i [kx - \nu(k) t]} \, dk.    (57)

The quantity |A(k)|² dk represents the luminous intensity
corresponding to the interval dk of the continuous spectrum; hence the function
|A(k)|² represents the intensity distribution in the spectrum. In
order to represent, by a single formula, the superposition of
radiation traveling in the forward direction and radiation traveling in the
opposite direction, we have extended the integral in (57) from −∞
to +∞, with the convention that A(−k) represents the amplitude
and phase of the waves of wave number k traveling in the opposite
direction.

At time zero, the distribution of f along the x-axis is given by

    f(x, 0) = \int_{-\infty}^{\infty} A(k) \, e^{2\pi i k x} \, dk,    (58)

a formula which may be identified with (53), provided that we put

    \omega = 2\pi k

and take the following expression for A:

    A(k) = 2\pi \, C(\omega).

With these substitutions, (54) yields

    A(k) = \int_{-\infty}^{\infty} f(x, 0) \, e^{-2\pi i k x} \, dx,    (59)

and the completeness relation (51) is written in this case as

    \int_{-\infty}^{\infty} |f(x, 0)|^2 \, dx = \int_{-\infty}^{\infty} |A(k)|^2 \, dk.    (51')

Thus, given the initial distribution of f, Fourier's theorem points
out how to decompose it into a superposition of infinite
monochromatic wave trains whose complex amplitudes are given by (59).
These wave trains then travel, each with a different velocity V(k),
and their superposition, at time t, gives rise to a function f(x, t) (in
general represented by a curve of different shape from the initial
one) expressed by (57).

In the case where f(x, 0) is an even (or odd) function, formulas
(53'), (54'), (53''), (54'') may be modified to yield
for even f:

    f(x, 0) = 2 \int_0^{\infty} A(k) \cos 2\pi k x \, dk,    (58')

    A(k) = 2 \int_0^{\infty} f(x, 0) \cos 2\pi k x \, dx;    (59')

and for odd f:

    f(x, 0) = 2i \int_0^{\infty} A(k) \sin 2\pi k x \, dk,    (58'')

    A(k) = \frac{2}{i} \int_0^{\infty} f(x, 0) \sin 2\pi k x \, dx.    (59'')

Example: Group of waves of constant amplitude. We shall apply the
preceding considerations to the case in which the initial distribution of f is
the one represented by Fig. 19, that is,

    f(x, 0) = \sin 2\pi k_0 x    (for −l < x < l),
    f(x, 0) = 0                  (for x < −l, x > l).

We say that we are dealing with a wave group of length 2l. We can
imagine it to be produced approximately by means of a source which is
capable of emitting monochromatic radiation of wavelength λ_0 but which
is turned on only during a finite interval of time, namely, that time during
which the light travels a distance 2l. Suppose that we receive such
radiation in a spectroscope (of infinite resolving power) and want to find
out the composition of the spectrum.

[Fig. 19]

Since we are dealing with an odd function, we shall use equations (58'')
and (59''), which yield

    f(x, 0) = 2i \int_0^{\infty} A(k) \sin 2\pi k x \, dk,

with

    A(k) = \frac{2}{i} \int_0^{l} \sin 2\pi k_0 x \, \sin 2\pi k x \, dx
         = \frac{l}{i} \left[ \frac{\sin 2\pi (k_0 - k) l}{2\pi (k_0 - k) l} - \frac{\sin 2\pi (k_0 + k) l}{2\pi (k_0 + k) l} \right].

We observe that the function (sin u)/u has a maximum (= 1) for u = 0
and an infinite number of other maxima to the left and right of the former,
these other maxima decreasing rapidly as they lie farther and farther away
from it. Hence A(k) will be appreciable only for the values of k near k_0,
and for these the second term will be negligible compared with the first.*
The intensity distribution over the spectrum will thus be given
approximately by

    |A(k)|^2 = l^2 \left[ \frac{\sin 2\pi (k_0 - k) l}{2\pi (k_0 - k) l} \right]^2

and is graphically represented by the curve of Fig. 20. Neglecting the
secondary maxima, we can say that the spectrum is composed of a line
which corresponds to the wave number k_0 but which is broadened: the
width of the line [provisionally defined as the distance between the two
minima A and B corresponding to 2π(k_0 − k)l = ±π] is obviously given
by δk = 1/l. If the source were to emit light indefinitely, that is, if
instead of a limited group of waves we had an unlimited train (l = ∞),
the spectral analysis would yield an infinitely sharp line, as is quite evident.

Therefore we can say that if a monochromatic wave train is cut short, it
loses its monochromaticity, and the spectral line corresponding to it takes on a
width inversely proportional to the total length of the wave group (or to the
duration of emission).

Hence, the light is no longer strictly monochromatic if it is not emitted
over an infinite time interval.
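
The spectrum of the truncated sine wave can be evaluated numerically and compared with the closed form obtained above. The following sketch assumes the illustrative values k_0 = 10 and l = 2 (so that 2k_0 l is an integer) and uses a midpoint rule for the integral in (59''); the function names and numerical settings are hypothetical.

```python
import math

def amplitude_modulus(k, k0, l, steps=4000):
    """|A(k)| from (59''): A(k) = (2/i) * integral from 0 to l of sin(2 pi k0 x) sin(2 pi k x) dx."""
    h = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.sin(2.0 * math.pi * k0 * x) * math.sin(2.0 * math.pi * k * x)
    return abs(2.0 * total * h)

def closed_form_modulus(k, k0, l):
    """|A(k)| from the closed expression: l times the difference of two (sin u)/u terms."""
    def sinc(u):
        return 1.0 if u == 0.0 else math.sin(u) / u
    first = sinc(2.0 * math.pi * (k0 - k) * l)
    second = sinc(2.0 * math.pi * (k0 + k) * l)
    return abs(l * (first - second))

k0, l = 10.0, 2.0
peak = amplitude_modulus(k0, k0, l)                      # central maximum at k = k0
minimum = amplitude_modulus(k0 + 1.0 / (2.0 * l), k0, l)  # first minimum on one side
print(peak, minimum)
print(amplitude_modulus(10.1, k0, l), closed_form_modulus(10.1, k0, l))
```

The central value is close to l, and the amplitude vanishes at k = k_0 ± 1/(2l), so that the two minima are a distance 1/l apart; doubling l in the sketch halves this distance.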

13. Width of a wave group and width of the corresponding 
spectral line. The observation at the end of the preceding section 
is of fundamental importance. It may also be extended to wave 
groups whose amplitude is not constant. (We shall say that we 
have a group whenever / has appreciable values only within a 
limited region of the x-axis and is zero or negligible over the remain¬ 
der.) For this extension, however, we must first give precise and 
general definitions of the width of a spectral line and the width of a 
wave group. 

* Since we are supposed to observe only the "forward"-moving waves, we
shall consider only positive values of k.

It is natural, meanwhile, to define as center of the spectral line
the "center of gravity" of the intensity, that is, the value k̄ of k
defined by

    \bar{k} = \frac{1}{I} \int_{-\infty}^{\infty} k \, |A|^2 \, dk,    (61)

where we have put (total intensity of the spectrum)

    I = \int_{-\infty}^{\infty} |A|^2 \, dk.    (62)

The half-width Δk of the line will be defined by the following
formula (analogous to the one defining the "radius of gyration" in
mechanics, or the root-mean-square error in error theory):

    (\Delta k)^2 = \frac{1}{I} \int_{-\infty}^{\infty} (k - \bar{k})^2 \, |A|^2 \, dk.    (63)

In perfectly analogous fashion, we shall define as center of the
wave group the point x̄ given by

    \bar{x} = \frac{1}{I} \int_{-\infty}^{\infty} x \, |f|^2 \, dx,    (64)

where the integral

    I = \int_{-\infty}^{\infty} |f|^2 \, dx    (62')

is the same as (62), by virtue of (51') of §12; and as half-width† of
the group we define Δx given by

    (\Delta x)^2 = \frac{1}{I} \int_{-\infty}^{\infty} (x - \bar{x})^2 \, |f|^2 \, dx.    (65)

After these preliminaries, it can be shown that the shorter the
wave group, the wider will be the spectral line corresponding to it;
and more precisely, that Δx and Δk are related by the inequality (of
considerable importance in wave mechanics)

    \Delta k \, \Delta x \ge \frac{1}{4\pi},    (66)

which is derived from Fourier's theorem and is independent of the
physical significance of the quantities involved.
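
Definitions (64) and (65) are easily evaluated numerically. As a minimal sketch, the code below applies them to the constant-amplitude group of §12, f(x, 0) = sin 2πk_0 x for −l < x < l, assuming the illustrative values k_0 = 5 and l = 1 and a midpoint-rule quadrature; the function name and parameters are hypothetical.

```python
import math

def center_and_half_width(f, x_min, x_max, steps=20000):
    """Return the center (64) and the half-width (65) of a group f on (x_min, x_max)."""
    h = (x_max - x_min) / steps
    xs = [x_min + (i + 0.5) * h for i in range(steps)]
    weights = [abs(f(x)) ** 2 for x in xs]
    total = sum(weights) * h                                    # I, as in (62')
    mean = sum(x * w for x, w in zip(xs, weights)) * h / total  # x-bar, as in (64)
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) * h / total
    return mean, math.sqrt(var)                                 # Delta x, as in (65)

k0, l = 5.0, 1.0

def group(x):
    """Truncated sine wave; the function vanishes outside (-l, l)."""
    return math.sin(2.0 * math.pi * k0 * x)

x_bar, delta_x = center_and_half_width(group, -l, l)
print(x_bar, delta_x, l / math.sqrt(3.0))
```

The center falls at the origin and the half-width comes out close to l/√3, the small difference being due to the oscillation of sin² within the group.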

If this definition were applied, for instance, to the wave group of Fig. 19, 
it would yield (approximately) Ax »* lly/Z) hence the ^Vidth” defined as 2Ax 
would be 1.155L 
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In order to demonstrate this proposition, let us introduce the 
function 

F{x) = fix, 0) e-2'*** (67) 

and let us start from the obvious statement that the quantity 

D = F + — 

2iAxy ^ dx 

is always > 0. Expanding this expression, we have 


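$$D = \frac{(x - \bar{x})^2}{4(\Delta x)^4}\,FF^* + \frac{x - \bar{x}}{2(\Delta x)^2}\left(F\,\frac{dF^*}{dx} + F^*\,\frac{dF}{dx}\right) + \frac{dF}{dx}\,\frac{dF^*}{dx}$$

$$\phantom{D} = \frac{(x - \bar{x})^2}{4(\Delta x)^4}\,FF^* + \frac{1}{2(\Delta x)^2}\,\frac{d}{dx}\Big[(x - \bar{x})\,FF^*\Big] - \frac{FF^*}{2(\Delta x)^2} + \frac{d}{dx}\left(F^*\,\frac{dF}{dx}\right) - F^*\,\frac{d^2F}{dx^2}.$$
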

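Let us multiply through by dx and integrate from −∞ to +∞,
noting that FF* = ff* = |f|², and taking (65) and (62') into account.
We obtain

$$\frac{I}{4(\Delta x)^2} + \frac{1}{2(\Delta x)^2}\Big[(x - \bar{x})\,FF^*\Big]_{-\infty}^{+\infty} - \frac{I}{2(\Delta x)^2} + \left[F^*\,\frac{dF}{dx}\right]_{-\infty}^{+\infty} - \int_{-\infty}^{+\infty} F^*\,\frac{d^2F}{dx^2}\,dx \ge 0.$$
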

If F is infinitesimal of sufficient order for x = ±∞, the second and
fourth terms will disappear, and there remains

$$-\frac{I}{4(\Delta x)^2} - \int_{-\infty}^{+\infty} F^*\,\frac{d^2F}{dx^2}\,dx \ge 0. \qquad (68)$$



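This integral may be related to Δk as follows. From (63), (59),
and (67) we have

$$(\Delta k)^2 = \frac{1}{I}\int_0^\infty (k - \bar{k})^2 A A^*\,dk = \frac{1}{I}\int_0^\infty (k - \bar{k})^2 A\,dk \int_{-\infty}^{\infty} f^*(x, 0)\,e^{2\pi i k x}\,dx$$

$$\phantom{(\Delta k)^2} = \frac{1}{I}\int_0^\infty (k - \bar{k})^2 A\,dk \int_{-\infty}^{\infty} F^*(x)\,e^{2\pi i (k - \bar{k}) x}\,dx;$$

by reversing the order of integration, we obtain

$$(\Delta k)^2 = \frac{1}{I}\int_{-\infty}^{\infty} F^*(x)\,dx \int_0^\infty (k - \bar{k})^2 A\,e^{2\pi i (k - \bar{k}) x}\,dk. \qquad (69)$$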

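The integral with respect to k may be obtained by noting that
(67) may be written, because of (58),

$$F(x) = \int_0^\infty A\,e^{2\pi i (k - \bar{k}) x}\,dk,$$

from which, upon differentiating twice, we get

$$\frac{d^2F}{dx^2} = -4\pi^2 \int_0^\infty (k - \bar{k})^2 A\,e^{2\pi i (k - \bar{k}) x}\,dk.$$
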
Substituting into (69) the expression found in this way by
integrating over k, this becomes

$$(\Delta k)^2 = -\frac{1}{4\pi^2 I}\int_{-\infty}^{\infty} F^*\,\frac{d^2F}{dx^2}\,dx.$$


Thus we have obtained the integral that occurs in (68), which
therefore becomes

$$-\frac{I}{4(\Delta x)^2} + 4\pi^2 I\,(\Delta k)^2 \ge 0,$$

from which, since I > 0, (66) may be obtained.

The product Δk Δx takes on the minimum value 1/(4π) when
D = 0, that is, when

$$\frac{x - \bar{x}}{2(\Delta x)^2}\,F + \frac{dF}{dx} = 0,$$

from which, by integration,

$$F = C\,e^{-\frac{(x - \bar{x})^2}{4(\Delta x)^2}}$$

or else

$$f(x, 0) = C\,e^{-\frac{(x - \bar{x})^2}{4(\Delta x)^2}}\,e^{2\pi i \bar{k} x}.$$

This function is represented by a curve of sinusoidal behavior,
enveloped by the curve

$$y = C\,e^{-\frac{(x - \bar{x})^2}{4(\Delta x)^2}}, \qquad (71)$$

which is the well-known Gaussian error curve with its center at
x = x̄.

This special form of wave group has the peculiarity that A(k) is
represented by an expression analogous to f. In fact, upon using
(59), we find

$$A(k) = C'\,e^{-\frac{(k - \bar{k})^2}{4(\Delta k)^2}}, \qquad (72)$$

where

$$C' = \frac{C}{2\sqrt{\pi}\,\Delta k}\,e^{-2\pi i (k - \bar{k}) \bar{x}}. \qquad (72')$$



Another theorem* on the Fourier integral which it is useful to
remember concerns the "width" of the wave group and the "width"
of the line, defined, however, not in the above fashion but as the
smallest interval, which we shall call 2Δ'x (or 2Δ'k), containing all
the values of x (or of k respectively) for which f (or A respectively)
is different from zero. For the case shown in Figs. 19 and 20, we
would have Δ'x = l, Δ'k = ∞. The theorem under discussion is
the following: Δ'x and Δ'k can never both be finite. This statement
implies, in particular, that if a light source emits light for a finite
time, the emitted light will not only be heterochromatic but also
its spectrum will (theoretically) be unbounded.
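
As a numerical check on the definitions (61) to (65) and on the inequality (66), the following Python sketch may be helpful. It samples a Gaussian group of the form (71), obtains A(k) from (59) by a plain midpoint-rule integration, and evaluates the two half-widths; it also evaluates the half-width of a rectangular group of length 2l. The helper names (`grid`, `half_width`, `spectrum`), the grid sizes, and the parameter values are illustrative assumptions and do not come from the text.

```python
import cmath
import math

def grid(lo, hi, n):
    """Midpoints of n equal cells covering [lo, hi], and the cell size."""
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h

def half_width(points, weights):
    """Centre of gravity and half-width of a sampled intensity,
    following (61)-(63) for k and (64)-(65) for x."""
    total = sum(weights)
    centre = sum(p * w for p, w in zip(points, weights)) / total
    var = sum((p - centre) ** 2 * w for p, w in zip(points, weights)) / total
    return centre, math.sqrt(var)

def spectrum(xs, fx, hx, k):
    """A(k) = integral of f(x, 0) exp(-2 pi i k x) dx, as in (59)."""
    return sum(f * cmath.exp(-2j * math.pi * k * x) for x, f in zip(xs, fx)) * hx

# Gaussian group (71): half-width DX0, centre X0, mean wave number K0 (all assumed).
DX0, X0, K0 = 0.5, 1.0, 4.0
xs, hx = grid(X0 - 8 * DX0, X0 + 8 * DX0, 400)
fx = [math.exp(-(x - X0) ** 2 / (4 * DX0 ** 2)) * cmath.exp(2j * math.pi * K0 * x)
      for x in xs]

ks, hk = grid(K0 - 1.5, K0 + 1.5, 300)
x_c, dx = half_width(xs, [abs(f) ** 2 for f in fx])
k_c, dk = half_width(ks, [abs(spectrum(xs, fx, hx, k)) ** 2 for k in ks])
product = dx * dk                      # (66): never below 1 / (4 pi)

# Rectangular group of length 2L (uniform intensity inside, zero outside).
L = 2.0
rs, _ = grid(-L, L, 2000)
_, dx_rect = half_width(rs, [1.0] * len(rs))

print(product, 1 / (4 * math.pi))      # Gaussian: the two numbers should agree closely
print(dx_rect, L / math.sqrt(3))       # rectangle: half-width close to L / sqrt(3)
```

For the Gaussian the product should sit at the lower bound 1/(4π); any other shape of group gives a larger product.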

14. Group velocity. Let us now consider an almost monochromatic
wave group, one such that only frequencies comprised
in a small interval appear with appreciable intensities in the Fourier
expansion; in other words, A(k) may be considered different from
zero only for k lying within a very small interval (k₀ − ε, k₀ + ε).
We can then introduce, within such an interval, a variable η, which
always takes on very small values, if we put k = k₀ + η; and hence

$$\nu = \nu_0 + \nu_0'\,\eta,$$

where ν₀ = ν(k₀), ν₀′ = (dν/dk) taken at k = k₀.

With this substitution the expansion of f may be written [see
(57)]:

$$f(x, t) = e^{2\pi i (k_0 x - \nu_0 t)} \int_{-\epsilon}^{\epsilon} A(k_0 + \eta)\,e^{2\pi i (x - \nu_0' t)\eta}\,d\eta. \qquad (73)$$

* J. von Neumann, Zeits. f. Physik 57, 31 (1929).

Upon comparing this formula with (55) which represents strictly
monochromatic waves, we observe that the integral occurring in (73)
takes the place of the amplitude A of (55). This integral, however, is
not a constant but a function of x and t. Therefore it can be stated
that equation (73) represents waves of frequency ν₀ and wavelength
1/k₀, but of variable amplitude (and phase). If for a given t we
represent the absolute value of the integral graphically as a function
of x (dotted curve in Fig. 21, which we shall call "envelope" of the
group), f will be represented by a type of sinusoidal curve of variable
amplitude inscribed within that curve.

Fig. 21


Then we note that the integral contains x and t only in the
combination x − ν₀′t; consequently, the envelope of the group displaces
itself without deformation, with velocity ν₀′. Therefore we
can say that the entire wave group advances with velocity ν₀′, which
is called group velocity for that reason, and will be designated by v.
Thus v will be (implying k = k₀)

$$v = \frac{d\nu}{dk}, \qquad (74)$$

or else

$$\frac{1}{v} = \frac{d(1/\lambda)}{d\nu}. \qquad (74')$$


It is to be noted that the group velocity v is in general different
from the velocity V = ν/k with which the individual waves travel
(phase velocity); in fact,

$$\nu = k\,V(k),$$

and hence

$$v = V(k_0) + k_0 V'(k_0). \qquad (75)$$

Only when V′ = 0 (that is, when there is no dispersion) do we have
v = V(k₀).


In order to understand how a wave group may displace itself as a
whole, with a velocity different from that of the individual waves, we
should observe the waves produced by a pebble on the surface of water.
While the group propagates, the outermost waves disappear gradually,
while others appear in the interior, so that each of them progresses gradually
toward the front of the group, the number of waves within the group
remaining constant. The velocity of each wave is therefore larger than
the velocity with which the group as a whole travels. This effect is due
to the fact that water waves travel with a velocity that is the greater the
lower the frequency (V′(k) < 0).
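
The relation between (74) and (75) can be tried out numerically on the water-wave case just described. The sketch below assumes the deep-water gravity-wave law ν = √(gk/2π) with k = 1/λ, which is brought in from outside the text, and uses hypothetical helper names; the derivatives are taken by central differences.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def nu(k):
    """Frequency of deep-water gravity waves for wave number k = 1/lambda
    (assumed dispersion law, not taken from the text)."""
    return math.sqrt(G * k / (2 * math.pi))

def phase_velocity(k):
    """V = nu / k, the velocity of the individual waves."""
    return nu(k) / k

def derivative(func, k, h=1e-6):
    """Central-difference estimate of d func / dk."""
    return (func(k + h) - func(k - h)) / (2 * h)

k0 = 0.5                                               # wavelength of 2 m
V = phase_velocity(k0)
v_group = derivative(nu, k0)                           # (74): v = d nu / dk
v_from_75 = V + k0 * derivative(phase_velocity, k0)    # (75): v = V + k0 V'

print(V, v_group, v_from_75)
```

Under this assumed law V′(k) is negative and the group velocity comes out at half the phase velocity, so the individual waves overtake the group, as in the pebble example.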


15. Waves in space; wave packets. The statements made in the
preceding sections may be easily extended to the propagation of
waves in three dimensions. A plane monochromatic wave train can
be defined by the wavelength λ, by the direction cosines α, β, γ of
the direction of propagation (normal to the wave front), and by the
complex amplitude A. The equation which describes this wave is,
in analogy to (55), from which it may be obtained by a change of
axes,

$$f(x, y, z, t) = A\,e^{\frac{2\pi i}{\lambda}(\alpha x + \beta y + \gamma z - Vt)} \qquad (76)$$

or else, upon introduction of the wave number k = 1/λ instead of λ,
and putting

$$k_x = k\alpha, \quad k_y = k\beta, \quad k_z = k\gamma, \qquad (77)$$

$$f(x, y, z, t) = A\,e^{2\pi i (k_x x + k_y y + k_z z - \nu t)}. \qquad (78)$$

It is convenient to consider k_x, k_y, k_z as components of a vector
k (propagation vector) that represents by its absolute value k the
wave number 1/λ, and by its direction the direction of propagation.
Indicating by r the vector with components x, y, z which fix the
point where f is being considered, one can write (78) in invariant
form, that is, in a form independent of the axes:

$$f(\mathbf{r}, t) = A\,e^{2\pi i (\mathbf{k} \cdot \mathbf{r} - \nu t)}. \qquad (78')$$

It is useful to remember from now on that in the case of electromagnetic
waves this vector is closely related to the momentum p of the photons (see §3).
In fact, p has the same direction as k and the magnitude p = h/λ = hk;
therefore k = (1/h) p.

Superimposing such infinite wave trains of all possible wave
numbers and directions, we obtain an f represented by

$$f(\mathbf{r}, t) = \iiint_{-\infty}^{\infty} A(\mathbf{k})\,e^{2\pi i (\mathbf{k} \cdot \mathbf{r} - \nu t)}\,dk_x\,dk_y\,dk_z, \qquad (79)$$

where ν is in general a function of k_x, k_y, k_z; or of k only, if the
medium is isotropic, as we shall assume.

For a given value of t (for example, t = 0), this formula may be
considered an expansion in Fourier integrals of a given function of
x, y, z, an expansion that may be obtained by successively applying
the formula of §11 three times. We can see that the coefficient
A, the amplitude of the individual wave trains, is given by the
following formula, which is the generalization of (59):

$$A(\mathbf{k}) = \iiint_{-\infty}^{\infty} f(\mathbf{r}, 0)\,e^{-2\pi i\,\mathbf{k} \cdot \mathbf{r}}\,dx\,dy\,dz. \qquad (80)$$


When f has appreciable values only within a limited region of
space and is zero or negligible outside this region, we shall say that
it represents a wave packet (which is the three-dimensional analogue
of the "wave group," defined in §13). A particular wave packet
may, for instance, be physically realized in the following manner.
Consider a projector capable of emitting a cylindrical pencil of
light, which is turned on for a very short time interval, during which
the light covers a distance 2l. There will then originate from the
projector a cylindrical "section" of light that constitutes a luminous
wave packet.

More exactly, A dk_x dk_y dk_z represents the (complex) amplitude of the
wave trains having propagation vectors lying between (k_x, k_y, k_z) and
(k_x + dk_x, k_y + dk_y, k_z + dk_z).

It is important to note that the wave packet does not rigorously maintain 
its shape and size unchanged but has a tendency to spread as it propagates. 
This tendency is caused by diffraction phenomena arising from the diaphragms 
which limit the light beam, and in addition, if the medium of propagation is 
not a vacuum, from a possible "dispersion" of the medium. However, within
the limits of approximation where geometrical optics holds, we can often ignore
this spreading. In this case the packet (which may here be considered to be
a point) will evidently describe the path of a "ray," according to the laws of
geometrical optics.

As center of the packet we define the center of gravity of |f|²,
that is, the point whose coordinates x̄, ȳ, z̄ are given by

$$\bar{x} = \frac{1}{I}\iiint_{-\infty}^{\infty} x\,|f|^2\,dx\,dy\,dz \quad \left(\text{with } I = \iiint_{-\infty}^{\infty} |f|^2\,dx\,dy\,dz\right), \qquad (64')$$

and two other analogous formulas. This packet travels with a
velocity equal to the group velocity, which has already been defined
for the one-dimensional case. Similarly one may define, by
formulas analogous to (65), the three half-widths Δx, Δy, Δz that
give an indication of the extent of the packet along the three axes.

By means of the Fourier expansion, a wave packet may be considered
to be obtained by the superposition of infinite monochromatic
wave trains, of different propagation vectors, which interfere
destructively everywhere except in the region occupied by the
packet.

In analogy to (61), we can introduce an average propagation
vector k̄ defined by

$$\bar{k}_x = \frac{1}{I}\iiint_{-\infty}^{\infty} k_x\,|A|^2\,dk_x\,dk_y\,dk_z \qquad (81)$$


and by two other analogous formulas; furthermore, we define the
quantities Δk_x, Δk_y, Δk_z, in analogy to (63), by

$$(\Delta k_x)^2 = \frac{1}{I}\iiint_{-\infty}^{\infty} (k_x - \bar{k}_x)^2\,|A|^2\,dk_x\,dk_y\,dk_z \qquad (82)$$


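and so on. In (82) we may apply the reasoning of §13 with respect
to each of the three variables, finding the inequalities

$$\Delta x\,\Delta k_x \ge \frac{1}{4\pi}, \qquad \Delta y\,\Delta k_y \ge \frac{1}{4\pi}, \qquad \Delta z\,\Delta k_z \ge \frac{1}{4\pi}. \qquad (83)$$
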

These inequalities tell us that the smaller the extent of the packet in
space, the wider must be the distribution of propagation vectors of the
wave trains composing it. In practice, the subject of optics deals
fairly often with packets of quite large dimensions (compared with
the mean wavelength), which may thus be constructed from wave
trains of almost a single frequency and direction of propagation.

In a similar way, considering f(r, t) as a function of t only (that
is, fixing our attention upon a definite point of space) and expanding
it in Fourier integrals, we would find the relation

$$\Delta\nu\,\Delta t \ge \frac{1}{4\pi}, \qquad (84)$$

where Δν is defined in a manner analogous to Δk_x, and so on, and Δt
analogous to Δx, and so on. Therefore, when dealing with a limited
wave packet, we can say that Δt measures the half-duration of its
travel past a fixed point of space, and Δν the half-width of the corresponding
spectral line, on the frequency scale. In this way we
find, from a different point of view, that light of limited duration
cannot be monochromatic, and that the shorter the duration, the
larger is the width of the spectral line.
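
To get a feeling for the orders of magnitude implied by (84), one may evaluate the smallest half-width compatible with a given half-duration. The short Python sketch below does only that; the function name and the sample durations are illustrative assumptions.

```python
import math

def min_half_width_frequency(half_duration):
    """Smallest half-width (in Hz) allowed by (84) for a wave packet whose
    half-duration at a fixed point is half_duration (in seconds)."""
    return 1.0 / (4.0 * math.pi * half_duration)

# Three sample half-durations: a millisecond, a nanosecond, a femtosecond.
for dt in (1e-3, 1e-9, 1e-15):
    bound = min_half_width_frequency(dt)
    print(f"half-duration {dt:g} s -> half-width of at least {bound:.3g} Hz")
```

The bound grows in inverse proportion to the duration, which is the quantitative content of the statement that a shorter flash gives a wider line.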

16. Note on equations with singularities. In the preceding sections
we have always assumed that the coefficients of the differential
equation are regular over the entire interval considered, including
the end points. In many problems of wave mechanics, however,
equations occur in which some of the coefficients of the equation
[written in form (1')] become infinite at some point of the interval,
usually at an endpoint (singular points). Therefore a brief review
will be given here of some of the facts concerning the singularities of
linear (homogeneous, second-order) differential equations whose
coefficients we shall suppose to be single-valued functions of x.

At a point x₀, the equation (1') is said to possess a Fuchsian (also
called nonessential) singularity, if for x = x₀ at least one of the coefficients
P, Q becomes infinite (not to higher than the first order for
P, however, and not to higher than the second order for Q), so that
the equation may be written

$$y'' + \frac{\mathcal{P}(x - x_0)}{x - x_0}\,y' + \frac{\mathcal{Q}(x - x_0)}{(x - x_0)^2}\,y = 0, \qquad (85)$$

where $\mathcal{P}(x - x_0)$ and $\mathcal{Q}(x - x_0)$ represent ordinary power series
in x − x₀ (with integral, nonnegative exponents). If instead for
x = x₀, the coefficient P becomes infinite to an order higher than
the first, or Q to a higher order than the second, the singularity is
said to be of the non-Fuchsian or essential type.

For greater detail see, for example, No. 34 of the Bibliography.

The reason for this distinction lies in the behavior of the solutions
in the neighborhood of the point x = x₀. In the case of the
Fuchsian singularity any integral of the equation is either regular
at x = x₀ or at most possesses a singularity that permits us to write
the solution

$$y = (x - x_0)^{\alpha}\,\varphi(x - x_0), \qquad (86)$$

α being real (in general not an integer) and φ being a regular function
(or, in an exceptional case which we shall mention below, a
function having a logarithmic singularity). On the other hand, if
the coefficients at x = x₀ do not satisfy the aforesaid condition,
there may exist solutions with singularities of different types, or
solutions that become altogether indeterminate at x = x₀.

Excluding from our considerations the non-Fuchsian singularities,
let us start from equation (85) and try to satisfy it by a solution
of the form (86), taking for φ a regular function, that is, a series of
integral, positive powers in x − x₀, which we may suppose ≠ 0 for
x = x₀.

Substituting expression (86) into equation (85) and setting the
coefficients of the various powers of x − x₀ equal to zero, we may
determine formally the coefficients of the series φ. The first of
the conditions obtained in this manner serves to determine α; it is
easily found to be

$$\alpha(\alpha - 1) + \alpha\,\mathcal{P}(0) + \mathcal{Q}(0) = 0. \qquad (87)$$

This is a second-degree equation that in general yields two roots,
α₁ and α₂, to which there correspond, in general, two integrals of the
form (86) which are independent (as can be shown) and which thus
can be taken as fundamental integrals. The radius of convergence
of this series extends to the nearest singular point. If, however,
the two roots of (87) differ by an integer (or, in particular, if they
coincide), then only one of them (the one whose real part is greater)
will furnish a solution of the type (86); the other one gives rise to
difficulties in the formation of successive coefficients of φ. In that
case, instead of a series of integral positive powers, we must take
for φ an expression of the type φ₁ + φ₂ log (x − x₀), where φ₁ and
φ₂ are regular functions. However, we shall not insist upon this
exceptional case but shall limit our remarks to stating that in all
cases we may find at least one integral of the form (86) by means of
(87). Equation (87) is called the indicial equation pertaining to the
singularity x₀.
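
The use of the indicial equation (87) can be sketched in a few lines of Python. The example chosen here, Bessel's equation y'' + y'/x + (1 − n²/x²) y = 0 at x₀ = 0, is not taken from the text; for it the series in (85) have constant terms 1 and −n², so (87) gives α = ±n. The function names are hypothetical.

```python
import cmath

def indicial_roots(p0, q0):
    """Roots of alpha*(alpha - 1) + alpha*p0 + q0 = 0, i.e. equation (87)
    with p0 and q0 the constant terms of the two power series in (85)."""
    b = p0 - 1.0
    disc = cmath.sqrt(b * b - 4.0 * q0)
    return (-b + disc) / 2.0, (-b - disc) / 2.0

def differ_by_integer(a1, a2, tol=1e-9):
    """True when the roots coincide or differ by an integer, which is the
    exceptional case in which a logarithmic term may be needed."""
    d = a1 - a2
    return abs(d.imag) < tol and abs(d.real - round(d.real)) < tol

# Bessel's equation of order n at x0 = 0: constant terms 1 and -n^2.
roots_third = indicial_roots(1.0, -(1.0 / 3.0) ** 2)   # order n = 1/3
roots_one = indicial_roots(1.0, -1.0)                  # order n = 1

print(roots_third, differ_by_integer(*roots_third))
print(roots_one, differ_by_integer(*roots_one))
```

For order 1/3 the two roots give two independent integrals of the form (86); for order 1 the roots differ by an integer, and only the larger root is guaranteed to give a solution of that form.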

Finally, we shall point out the case in which the singularity lies
at infinity. This case may be reduced, as is well known, to the
previous one by the transformation x = 1/ξ. In this way we can
easily find that the necessary and sufficient condition for the point
at infinity not to be a singularity of the non-Fuchsian type is as
follows: for x → ∞ the coefficient P must be infinitesimal at least of
the first order, and the coefficient Q at least of the second order; that
is, (for sufficiently large x)

$$P = \frac{1}{x}\,\mathcal{P}\!\left(\frac{1}{x}\right), \qquad Q = \frac{1}{x^2}\,\mathcal{Q}\!\left(\frac{1}{x}\right), \qquad (88)$$

where $\mathcal{P}$ and $\mathcal{Q}$ are positive integral power series in 1/x.

17. Note on partial differential equations. The equations of
interest in wave mechanics are, in most cases, linear, homogeneous
partial differential equations. Many of the considerations for the
case of a single variable may be extended to such cases. We shall
mention each of these extensions briefly, referring here, for convenience
of notation, to the case of two independent variables, x
and y (the extension to three or more variables is immediate).

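The equation will be said to be self-adjoint if it has the following
form, analogous to (12):

$$\frac{\partial}{\partial x}\left(P\,\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(P\,\frac{\partial u}{\partial y}\right) + R\,u = 0,$$

where P and R are two functions of x and y (which we shall suppose
to be analytic). In R there often occurs a parameter λ, just as in
(14); that is, the equation is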

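$$\frac{\partial}{\partial x}\left(P\,\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(P\,\frac{\partial u}{\partial y}\right) + \dots$$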

An important particular case is that in which P = 1, so that the
first two terms form the Laplacian Δu.

The eigenfunction problem, stated for the case of two variables,
is the following: Given a region S of the plane limited by a contour σ,
to determine a u(x, y) which within S satisfies (89) and which vanishes
on the contour, or has a vanishing normal derivative.

This problem does not in general possess regular nonzero solutions,
except for certain values λᵢ of the parameter λ, which are
called eigenvalues. These eigenvalues form (if the region S is of
finite extension) a denumerable infinity. To each of them there
correspond one or more independent eigenfunctions uᵢ(x, y); if more
than one, the eigenvalue is called multiple. Two eigenfunctions
uₙ, uₘ, corresponding to two different eigenvalues λₙ, λₘ, have the
orthogonality property

$$\iint_S u_n\,u_m^*\,dS = 0 \qquad (90)$$

(where the integral is extended over the entire region S, and dS =
dx dy), which is proved in a manner perfectly analogous to the
method followed in the case of a single variable.

Similarly we impose upon the eigenfunctions the normalization
condition:

$$\iint_S u_n\,u_n^*\,dS = 1. \qquad (91)$$

There are no generally applicable methods for the solution of
problems involving partial derivatives. One that succeeds most
often is the method of separation of variables. This consists in
looking for a solution u(x, y) which is the product of a function X of
x alone, times a function Y of y alone. Therefore we must put into
(89)

$$u(x, y) = X(x)\,Y(y), \qquad (92)$$

and we must try, if possible, to effect such a reduction of this equation
that we may equate an expression containing functions of x
only to one containing functions of y only. If this attempt succeeds,
each of the two sides will separately have to be equal to a
constant (since both x and y must be able to vary independently).
Thus we obtain two equations involving ordinary derivatives of the 
two functions X and Y. Hence the problem is reduced to that of a 
single variable. 
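
A minimal Python sketch of the separation of variables follows. It assumes the particular case P = 1 with the λ-dependent term equal to λu, so that the equation is Δu + λu = 0, on a rectangle 0 < x < a, 0 < y < b with u vanishing on the contour; these choices, the side lengths, and the function names are assumptions made for illustration. The product u = X(x)Y(y) then has sine factors, and the eigenvalue is the sum of the two separation constants.

```python
import math

A_SIDE, B_SIDE = 1.0, 2.0      # sides of the rectangle (illustrative values)

def eigenvalue(m, n):
    """lambda_{mn}: sum of the separation constants of the x and y equations."""
    return math.pi ** 2 * ((m / A_SIDE) ** 2 + (n / B_SIDE) ** 2)

def u(m, n, x, y):
    """Normalized product solution X(x) Y(y) vanishing on the contour."""
    norm = 2.0 / math.sqrt(A_SIDE * B_SIDE)
    return (norm * math.sin(m * math.pi * x / A_SIDE)
            * math.sin(n * math.pi * y / B_SIDE))

def residual(m, n, x, y, h=1e-3):
    """Finite-difference value of Laplacian(u) + lambda * u; should be near zero."""
    lap = (u(m, n, x + h, y) + u(m, n, x - h, y) + u(m, n, x, y + h)
           + u(m, n, x, y - h) - 4 * u(m, n, x, y)) / h ** 2
    return lap + eigenvalue(m, n) * u(m, n, x, y)

def overlap(m1, n1, m2, n2, steps=100):
    """Midpoint-rule value of the integral of u_{m1 n1} u_{m2 n2} over the
    rectangle, to be compared with (90) and (91)."""
    hx, hy = A_SIDE / steps, B_SIDE / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        for j in range(steps):
            y = (j + 0.5) * hy
            total += u(m1, n1, x, y) * u(m2, n2, x, y)
    return total * hx * hy

print(residual(2, 3, 0.3, 0.7))    # small: the product satisfies the equation
print(overlap(1, 1, 1, 1))         # close to 1, normalization (91)
print(overlap(1, 1, 2, 1))         # close to 0, orthogonality (90)
```

The eigenfunctions here are real, so the complex conjugate in (90) and (91) plays no role.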

The separation of variables is effectively possible in the majority
of the cases in which we are interested here, and we shall see examples
later on. It is to be noted that the possibility of separation of
variables depends on the coordinate system chosen. By effecting
a change of coordinates (for example, from Cartesian to polar
coordinates), a separation that was not possible in the original system
may sometimes be obtained in the new system. In general,
the system of reference that allows the separation of variables has a
definite relation to the geometrical properties and to the symmetry
of the problem. For example, in problems with spherical symmetry,
polar coordinates permit the separation, and so on.

See Eisenhart, Phys. Rev. 46, 428 (1934), and also Pauling and Wilson,
Introduction to Quantum Mechanics, Appendix IV (McGraw-Hill, 1935).



CHAPTER 6 


Probabilistic Statement of Atomic Problems 

In this chapter we shall show how the fundamental problems
arising from atomic phenomena must be stated, in order not to
violate the logical principle which Heisenberg placed at the base of
theoretical physics (see §34). That is, quantities which are not at
least conceptually observable, or relations which are not at least 
conceptually verifiable, must no longer enter into our reasoning in 
an essential capacity. This statement leads, as we shall see, to a 
resolution of the apparent contradictions between the wave and 
particle nature of both matter and radiation. However, since the 
reasoning is perhaps easier to understand in the case of radiation, we 
shall start with the latter, and shall then transfer our considerations 
to the case of matter by analogy. 

18. Analysis of the photon concept. If we think of the different
means which we generally use to detect radiation (vision, photographic
plate, photoelectric cell, and others) we recognize that they
detect the radiation only by means of the modifications experienced
by the material particles absorbing the radiation. These modifications
consist essentially in the acquisition of a certain amount of
energy (quantum) which manifests itself by chemical actions (on the
retina or the photographic plate) or thermal actions, or else by the
emission of electrons, and so on. We shall call such phenomena
elementary absorption processes. Another, less common, way to
detect radiation consists in making use of the pressure which it
exerts, or of the momentum which it imparts to an electron, as in the
Compton effect. In these cases we make use, not of the energy,
but of the momentum imparted to matter by radiation. Similarly,
we may detect that a material particle has completed an
elementary emission process either by the fact that its energy has
decreased without any other cause or by a change in its momentum
(recoil) not attributable to any other reason. To sum up, radiation
is detected in all cases through its interaction with material particles.
It is not possible to detect it on its trajectory from one to the
next particle but only at the "departure" or at the "arrival."

For a better understanding of the scope of this impossibility, it 
is useful to discuss several apparent violations. 

A rather trivial one is the well-known fact that when a light
beam traverses a turbid medium, its path becomes laterally visible.
But it is easily understood that in this case it is only the coarseness
of the means of observation which gives us the illusion of seeing a
continuous luminous streak. If we observed it with sufficient
accuracy, however, we should see the individual scattering particles
in the form of separate brilliant points, actually visible to the
naked eye in the case of pollen, or with the microscope in the case of
turbid solutions. If, then, the scattering particles are single molecules,
they are only practically, not conceptually, invisible. But
the succession of these brilliant points does not, in fact, outline the
points along the trajectory of one and the same photon. Instead,
we are clearly dealing with a situation where some of the quanta
emitted by the source are scattered laterally, while those which are
not scattered pass by unobserved. If the source emitted a single
quantum, the quantum would be scattered by a material particle,
and the observer would see a single bright point; or else it would not
be scattered, and the observer, standing to one side, would see
nothing.

Let us now examine the validity of the following argument. If 
at A there is a light source which undergoes an elementary emission 
process, and if at B there is a particle completing an elementary 
absorption process, it seems possible to assert that the quantum 
emitted by A has traveled along the straight line segment AB in order 
to reach B where it was absorbed. But if we try to find some means 
to check this assertion, we shall soon realize its inconsistency. In 
fact, to lend credence to this assertion, it would be necessary, by 
placing an opaque screen with a small hole between A and B, to 
find out if the particle B receives the quantum only if the hole lies
on the line AB. However, it is well known that a small opening
gives rise to diffraction phenomena by virtue of which, if the hole
is on the line AB, it may happen that the particle B does not receive
the light; conversely, it may happen that B does receive the light 
even though the hole is located outside of the line AB. 

We might suppose that it is legitimate to attribute to the photon
the trajectory constituted by the path ACB when, after we have
placed the hole of the screen at any point C and have repeated the
experiment many times, it finally happens that particle B receives
the energy emitted from A. Evidently we may then repeat the
argument given above for each of the portions AC and CB, and we
find that the construction of the trajectory with two segments of a
straight line is unjustified.

Then we might think that we might achieve the complete determination
of the trajectory by a limiting experiment, either by means
of a succession of an infinite number of screens each with an infinitely
small hole, or, equivalently, by constructing between A and B
an infinitely narrow tube with absorbing walls of any shape, rectilinear
or not. Then whenever it happens that particle B receives
the quantum, we could maintain that the quantum has followed the
trajectory defined by the tube. But such a statement would be
devoid of physical significance, since we have no other way of checking
it than to insert, at an intermediate point of the tube, a material
particle capable of absorbing or scattering the light, which would 
then not reach B any more. This amounts to saying that the 
phenomenon under study would be altered in a profound way by 
the very fact of being observed. 

We see from these considerations that it is possible to define, in 
an operational sense, the emission or absorption of a photon but not 
the trajectory followed by it. Hence it is not permissible to imagine 
the phenomenon as the motion of a point or of a particle, and it is 
not very appropriate to speak (as we have done just now, for the
sake of simplicity of expression) of a photon's "departure" from or
"arrival" at an atom. It is true that the elementary emission
process has some analogies to the ejection of a material projectile 
(recoil and energy loss) and that the elementary absorption process 
has some analogies to the collision of a projectile with the absorbing 
atom. Furthermore (for the simple case where no media lie in 
between), the second phenomenon occurs with a delay with respect 
to the first, equal to the time required by an object to travel with 
velocity c. But the analogy does not extend to the intermediate 
stage of the phenomenon, since the fundamental property of the 
kinematics of a point is missing, namely, the continuity of the 
trajectory. Our mental habit makes us consider this property as 
being inseparably connected with the others, whereas it is logically
independent of them. We are led into the error of constructing a
literally corpuscular model for optical phenomena—to think of 
photons as ordinary bodies extremely reduced in size but gifted 
with ordinary kinematic properties. Thus we commit the same 
error which we should fall heir to if we were to interpret literally the 
well-known analogies between electric and hydrodynamic currents, 
taking electricity as an actual liquid. It is not surprising that a 
literally corpuscular theory leads to wrong results, since a premise 
contrary to experience has been introduced, namely, that of a complete
analogy between photons and ordinary bodies.

It is therefore necessary to attribute to the corpuscular model
only the value of an analogy limited to certain aspects of phenomena,
such as conservation of energy and momentum. If this
limitation is kept in mind, the corpuscular model is still of great
help, since it furnishes an expressive language for the description of
many phenomena by avoiding long circumlocutions. For instance,
we shall say "a photon arrives" on a certain surface element
instead of, "If this surface element were covered with a perfectly
absorbing substance, an atom of the latter would undergo an
elementary absorption process." Similarly, the expression "to
find a photon in a certain volume element dS" means "to fill dS
with a completely absorbing material and to determine that an
atom of the latter undergoes an elementary absorption process."

In order to see now under what limitations it is permissible to
use corpuscular terminology, we shall once again take up the example
(see §15) of a projector which, upon being turned on for a short
time, emits a cylinder or "packet" of light (which we shall assume
to be almost monochromatic). If we want to describe the phenomenon
objectively without having recourse to any model, neither
wave nor corpuscular, we must say that there is a region S of space
(displacing itself with velocity c) which is characterized by the
property that, when an atom finds itself within it, the atom may
undergo an elementary absorption process. In that case it absorbs
a certain quantity of energy w characteristic of the radiation. The
probability that the absorption takes place is proportional (for a
given type of absorbing atom) to the energy density W, which we
assume to be uniform. If we want to translate this situation into
the corpuscular model, we shall say that the region S is populated
with photons, each having an energy w, so that in each unit volume
there will be on the average W/w of them. It is clear that if we
take an element of volume dS such that (W/w) dS is small compared
to unity, this expression does not represent the number of photons
which are to be found in dS, but rather the probability of finding
one photon. All this is perfectly identical with what we could say
in the case of an aggregate of rubber balls. There is, however, one
fundamental difference here. In the case of the balls a perfect
knowledge of all conditions of throw would permit us, at least
theoretically, to calculate the position of each ball at any instant,
and hence to know with certainty whether a ball is to be found in
the element dS at a given instant. Thus the probability concept
enters into the question only because of the practical impossibility
of knowing all the elements needed for a precise calculation of the
trajectory. On the other hand, in the case of light, there is no
meaning in assigning to each photon a position and a velocity
within the aggregate, so that no calculation, even in theory, may
predict whether a photon is to be found within the element dS at
a given instant. All we can say is that the probability of finding
it there, when observing it, is (W/w) dS. In this case, therefore, the
concept of probability enters into the question, not because of our
ignorance of some elements of the problem, but because of the nature
of the phenomenon itself. Nevertheless, once this important conceptual
difference between the two cases has been pointed out, we
may formally ignore it at times, or else use, in connection with
photons, the same terminology which would be used for material
projectiles, at the expense of considering the elements of their
motion as imperfectly known, or else known with an error whose
law of probability is known. Actually, these elements are physically
undetermined, at least in part, or else conceptually unobservable, as
will be explained more fully below.
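
The quantity (W/w) dS can be given a concrete size with a short Python sketch. It assumes that the energy of one quantum is w = hν; the energy density, frequency, and volume element below are purely illustrative numbers, and the function names are hypothetical.

```python
PLANCK_H = 6.62607015e-34   # Planck constant in J s (exact SI value)

def photon_energy(nu):
    """Energy w = h * nu of one quantum of frequency nu (in Hz), in joules
    (assumed relation between w and the frequency)."""
    return PLANCK_H * nu

def mean_photon_number(energy_density, nu, volume):
    """Average number of photons (W / w) dS in a volume element dS."""
    return energy_density / photon_energy(nu) * volume

W = 1.0e-12      # energy density in J/m^3 (illustrative)
NU = 5.0e14      # frequency in Hz, visible light (illustrative)
DS = 1.0e-9      # volume element in m^3, one cubic millimetre

n_bar = mean_photon_number(W, NU, DS)
print(n_bar)     # much smaller than unity for these numbers
```

Because the result is small compared to unity, it is to be read as the probability of finding one photon in dS rather than as a number of photons.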

One always has to remember that to say the elements of the 
motion of a photon (position, trajectory, momentum, and so on) 
are "not physically determined" has a meaning profoundly different 
from saying that these elements, though physically determined, are 
"not known" but could be known a priori through a conceptually 
possible observation without disturbing the phenomenon. To 
understand the difference, visualize the following example: Between 
a source O which emits photons (at time intervals which may be 
large) and a screen s upon which they fall, there is an opaque screen 
with two holes A and B. Interference fringes will be formed on the 
screen. For each photon it is undetermined whether it passed 
through hole A or B. Any attempt to decide this question, as by 
blocking one of the two paths, would alter the interference effect. 
Let us now suppose an installation which automatically closes one 
of the two holes and then the other in an irregular manner, so that 
the passage of each photon through A or B will depend on an automatic 
chance law. To an observer who does not see the mechanism, 
it is unknown whether a given photon has passed through A or 
through B, but contrary to the previous case, it is conceptually 
possible to decide the question without altering the phenomenon. 
The difference lies in the fact that in this case, no interference 
fringes are formed on the screen s. Thus the phenomenon is 
physically different. 

19. Probabilistic meaning of wave optics. We shall now 
examine the significance which must be attributed to the wave 
theory of light, and its relation to experimental facts. It is known 
that all the experiments on image formation, interference, diffrac¬ 
tion, and so on, exactly confirm the predictions of the wave theory. 
These confirmations consist essentially in the following: given the 
shape, position, and nature of the sources and the various media 
(lenses, screens, mirrors) we calculate, by means of the laws of 
optics (or else, more exactly, by means of the wave equation) what 
the intensity of illumination I should be at various points of the 
screen, which in general is the retina or a photographic plate. We 
perform the experiment and find that the measured intensity cor¬ 
responds, point for point, to the function I coming from calculation 
and reproducing the predicted shape of the images, shadows, inter¬ 
ference fringes, and other factors. In view of the already mentioned 
discontinuous nature of absorption phenomena, the measured 
intensity of illumination, in the case of monochromatic light, to 
which we shall limit ourselves for now, divided by the energy w of a 
photon, represents the number of photons per unit area per second 
arriving on the screen. The verification therefore consists in 
determining that on an area element dσ, in a time dt, there arrive 
(I/w) dσ dt photons, where I is given as a function of position by 
the integration of the wave equations (precisely, I is the projection 
of the Poynting vector upon the normal to the surface). 

It is to be noted, however, that this statement can be verified 
only if the photons are plentiful (as usually happens in optical 
experiments). If they are scarce or if there is just one, wave optics 
breaks down. If, for instance, the source emits a single quantum 
of energy into a photographic plate, wave optics predicts that the 
energy must be spread over the whole plate with a certain intensity 
I, whereas in reality we shall find all the energy at a single point. 
Indeed, it is just these cases that have suggested the corpuscular 
model. What significance, then, may be attributed to the intensity 
I given in wave optics? Evidently it will measure the "probability 
density" that the photon will arrive at one point of the plate, 
rather than at another. More exactly, if n photons are emitted, 
then each emitted photon has the probability (I/nw) dσ dt of arriving 
on the element dσ in the time dt. 

This probabilistic interpretation of I is valid in every case. If 
the photons are numerous, we can be practically certain, by virtue 
of the law of large numbers (or Bernoulli's law), that there will 
actually arrive on dσ, in the time dt, a number of photons equal to 
this probability multiplied by n (that is, equal to (I/w) dσ dt) 
carrying an energy I dσ dt. In this way we again find the ordinary 
meaning of I. Thus we see that wave optics is merely a method of 
calculating the probability density of the photon distribution. When 
the photons are very numerous, their effective density of distribu¬ 
tion, because of the law of large numbers, turns out to be propor¬ 
tional to this probability density, which explains the reason for 
the agreement of theory and experiment in all ordinary optical 
phenomena. 
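
The argument of this section can be illustrated numerically. The following Python sketch is our own illustration and not part of the theory: it assumes a hypothetical fringe pattern sampled on eight cells of the screen (the list `fringes`, in arbitrary units) and lets each photon choose a cell with probability proportional to the intensity I on that cell. The function names `simulate_arrivals` and `max_deviation` are likewise assumptions introduced for the example.

```python
import random

def simulate_arrivals(intensity, n_photons, seed=0):
    """Let n_photons fall on the screen, each choosing a cell with
    probability proportional to the wave-optics intensity of that cell.
    Returns the number of photons counted in each cell."""
    rng = random.Random(seed)
    counts = [0] * len(intensity)
    for cell in rng.choices(range(len(intensity)), weights=intensity, k=n_photons):
        counts[cell] += 1
    return counts

def max_deviation(intensity, counts):
    """Largest gap between the observed fraction of photons in a cell
    and the normalized intensity of that cell."""
    total_i = sum(intensity)
    total_n = sum(counts)
    return max(abs(c / total_n - i / total_i) for c, i in zip(counts, intensity))

# Hypothetical fringe pattern (arbitrary units): dark and bright cells alternate.
fringes = [0.1, 1.0, 0.1, 1.0, 0.1, 1.0, 0.1, 1.0]

few = simulate_arrivals(fringes, 10)
many = simulate_arrivals(fringes, 100_000)
print("10 photons:    ", few, "deviation", round(max_deviation(fringes, few), 3))
print("100000 photons:", many, "deviation", round(max_deviation(fringes, many), 4))
```

With ten photons the counts bear little resemblance to the fringes; with a hundred thousand the observed fractions follow the normalized intensity closely, as the law of large numbers requires.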

Hence we have found that although it is useless to look for laws 
of motion of photons as they exist for material particles, we have 
nevertheless a means of determining their distribution statistically. 
This method is provided by the laws of wave optics. 

20. The principle of superposition in optics. We shall now 
proceed to the case of radiation which is not monochromatic, for 
example the case of two radiations of frequencies ν_1 and ν_2, with 
the possibility of their directions and intensities being different as 
well. Experience shows that the manifestations of these two 
radiations are additive; that is, their effects are superimposed with¬ 
out mutual perturbation. This effect is to be expected in wave 
theory from the linearity of the differential equations which govern 
it. Additivity is preserved even in the probabilistic interpretation: 
if the waves of frequency ν_1 produce an intensity of illumination I_1 
over a certain area, and those of frequency ν_2 an intensity I_2, there 
is a probability proportional to I_1/w_1 that the element of area dσ 
(assumed to be perfectly absorbing) will absorb a quantum w_1 = hν_1 
in the time dt, and an independent probability proportional to 
I_2/w_2 that it will absorb a quantum w_2 = hν_2. This independence 
of the probabilities, which may be extended to as many monochromatic 
components of the radiation as desired, is called the principle of 
superposition of optics. 
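
The rule may be put in numerical form. In the following Python sketch, which is only an illustration, the function name `absorption_probabilities` and the sample intensities and frequencies are assumptions; the sketch normalizes the weights I_i/w_i, with w_i = hν_i, so that they give the relative probability that an absorbed quantum belongs to each monochromatic component.

```python
H = 6.62607015e-34  # Planck constant in J s

def absorption_probabilities(components):
    """components: list of (intensity, frequency) pairs, one per
    monochromatic component. Returns the relative probabilities that an
    absorbed quantum belongs to each component, proportional to I / (h nu)."""
    weights = [intensity / (H * frequency) for intensity, frequency in components]
    total = sum(weights)
    return [w / total for w in weights]

# Two components of equal intensity; the second has twice the frequency.
probs = absorption_probabilities([(1.0, 5.0e14), (1.0, 1.0e15)])
print(probs)
```

With equal intensities, quanta of the lower frequency are absorbed twice as often, since each of them carries half the energy.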

We shall add a few words concerning the case in which the 
momentum of the photons is observed, as in the Compton effect. 
If in wave optics we deal, at a given point, with the passage of plane 
monochromatic waves having a certain propagation vector k, 
then all photons observed will have a momentum p = hk (see footnote, 
page 116). But if, for instance, two plane monochromatic 
wave trains are superimposed at the point under consideration, 
having propagation vectors k_1 and k_2 and intensities I_1 and I_2, then 
the principle of superposition states that there exists a certain 
probability, proportional to I_1/w_1, of finding a photon with momentum 
p_1 = hk_1, and a probability, proportional to I_2/w_2, of finding a 
photon with momentum p_2 = hk_2. We find analogous results for 
the superposition of as many wave trains as we wish, that is, for any 
radiation. 

It will be noted that the conditions described (superposition of 
wave trains of different frequencies and directions) may be intui¬ 
tively imagined in the corpuscular model, when the photons are 
very numerous; it suffices to think of a mixture of particles of differ¬ 
ent energies and momenta. However, if we are dealing with a single 
photon or just a few photons, the situation cannot be represented 
by any classical model. In fact, as long as no experiment is per¬ 
formed to determine the frequency and the momentum of the 
photon by altering them, it is conceptually impossible to foresee 
what result such an observation would yield; we can only assign 
probabilities to the various possible results. Hence the photon 
does not possess a determined energy and momentum up to the 
instant of observation; that is, for energy and momentum an 
uncertainty exists which is perfectly analogous to the uncertainty 
already considered for the position of single photons. Sometimes it 
may be convenient to have recourse to the corpuscular model in 
this case as well, replacing this indeterminacy by the supposition 
that the energy and momentum of the particle are unknown 
(or better, that only the laws of probability governing them are 
known). But it should always be remembered that this terminology 
is inappropriate, as the example of §18 shows. 

21. The uncertainty principle for photons. We shall now investigate 
the limits of validity of the corpuscular model, or else the 
limits within which it is meaningful to speak of the concepts of 
"position" and "momentum" of a photon at a certain instant. 
For this purpose we shall again use the example of the "light 
packet" S like the one considered in §§15 and 19. Supposing that 
the source has emitted a single photon, we can say with certainty 
that the photon is to be found within the region S constituting the 
packet (which may be determined by the laws of wave optics). 
Thus, though it is meaningless to speak of exact coordinates of the 
photon, we may assign limits to them which are the more restricted 
the smaller the packet. 

We shall now be concerned with the momentum of the photon. 
From §15, we recall that the wave packet may be considered to be 
the result of the superposition of an infinite number of mono¬ 
chromatic wave trains of different propagation vectors k, according 
to the decomposition formula (79) of Chapter 5. By the principle 
of superposition, there corresponds to each of these wave trains a 
value p for the momentum of the photon; that is, to the waves of 
propagation vector k there corresponds the possibility of finding 
the photon with a momentum p = hk. Hence there is an uncer¬ 
tainty concerning momentum as well as position. To state the 
matter precisely, if we remember that the amplitude of the wave 
trains having a propagation vector lying between (k_x, k_y, k_z) and 
(k_x + dk_x, k_y + dk_y, k_z + dk_z) is A dk_x dk_y dk_z, where A(k) is given 
by formula (80), we easily recognize that the probability that the 
momentum of the photon lies between (p_x, p_y, p_z) and (p_x + dp_x, 
p_y + dp_y, p_z + dp_z) is proportional to |A|² dp_x dp_y dp_z. This probability 
may be expressed in terms of p_x, p_y, p_z rather than of k_x, k_y, k_z 
by simply inserting p as a function of k into (80), which then 
becomes 

$$ A(\mathbf{p}) = \iiint_{-\infty}^{\infty} f(\mathbf{r}, 0)\, e^{-2\pi i\, \mathbf{p}\cdot\mathbf{r}/h}\, dx\, dy\, dz. \tag{80'} $$
In order to make this matter more explicit, it is convenient to 
introduce a space representative of momenta, in which each vector 
p is represented by a point of coordinates p_x, p_y, p_z. We can then 
say that, just as the position of the photon is uncertain in x, y, z-space 
and its probability density at different points is proportional to the 
energy density W(x, y, z) (and hence to |f|²), so the probability density 
of the momentum in p_x, p_y, p_z-space is represented by the function 
|A|² (which similarly defines a wave packet in momentum 
space). As a measure of the uncertainty in the coordinates, we 
may take the half-widths Δx, Δy, Δz of the wave packet, defined as 
in §13. Likewise, the uncertainty in the three momentum components 
will be measured by the quantities 

$$ \Delta p_x = h\,\Delta k_x, \quad \Delta p_y = h\,\Delta k_y, \quad \Delta p_z = h\,\Delta k_z, \tag{93} $$

where Δk_x, Δk_y, and Δk_z are defined by (82) and two other analogous 
expressions. 

We now recall the theorem proved in §15, which showed that 
the more restricted the wave packet in x, y, z-space, the greater the 
difference between the propagation vectors of the waves of which it 
is composed, or else the wider the packet in p-space. We reach 
the important conclusion that the more exactly the position of a photon 
is determined, the more uncertain the momentum, and vice versa. More 
precisely, between the uncertainties Δx, Δy, Δz of the coordinates 
and Δp_x, Δp_y, Δp_z of the respective momenta the following inequalities 
hold, which may be deduced immediately from (83) by multiplying 
through by h: 

$$ \Delta x\,\Delta p_x \geq \frac{h}{4\pi}, \quad \Delta y\,\Delta p_y \geq \frac{h}{4\pi}, \quad \Delta z\,\Delta p_z \geq \frac{h}{4\pi}. \tag{94} $$

Now the partial uncertainty mentioned in §18 is precisely estab¬ 
lished, and it must be applied to the corpuscular model in order to 
adapt it to experimental facts. We note that it is not theoretically 
impossible to localize a photon, at a given instant, with as great a 
precision as desired. However, the momentum of the photon, and 
hence its energy and direction of propagation, will then be com¬ 
pletely uncertain. 

Similarly, if we multiply (84) by h and recall that the energy w 
of a photon is given by hν, we have 

$$ \Delta w\,\Delta t \geq \frac{h}{4\pi}, \tag{95} $$

which expresses the fact that the more exactly the instant of passage 
of a photon through a given point of space is determined, the more 
uncertain its energy, and vice versa. 

Equations (94) and (95) express the uncertainty principle for 
photons. As may be seen, they are purely mathematical conse¬ 
quences of the fact, proved by experience, that the distribution 
probability of photons can be calculated from the wave theory. 
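
As a numerical check of this purely mathematical origin, one may take a Gaussian wave packet in one dimension. The sketch below is only an illustration under stated assumptions: the packet shape f(x) = exp(−x²/4σ²), the use of root-mean-square widths as the measure of uncertainty (the definitions of §13 and §15 may differ from these by a numerical factor), the convention k = 1/λ so that p = hk, and the names `rms_width` and `packet_widths` are all our own choices.

```python
import cmath
import math

def rms_width(grid, weights):
    """Root-mean-square spread of the points of `grid` under `weights`."""
    total = sum(weights)
    mean = sum(g * w for g, w in zip(grid, weights)) / total
    var = sum((g - mean) ** 2 * w for g, w in zip(grid, weights)) / total
    return math.sqrt(var)

def packet_widths(sigma, n=201):
    """Return (dx, dk) for the packet f(x) = exp(-x^2 / (4 sigma^2)).

    dx is the rms width of |f|^2 and dk the rms width of |A|^2, where
    A(k) = integral of f(x) exp(-2 pi i k x) dx is evaluated as a Riemann sum.
    """
    xs = [16 * sigma * (i / (n - 1) - 0.5) for i in range(n)]   # -8 sigma .. +8 sigma
    f = [math.exp(-x * x / (4 * sigma ** 2)) for x in xs]
    k_max = 8 / (4 * math.pi * sigma)                           # a generous range in k
    ks = [2 * k_max * (j / (n - 1) - 0.5) for j in range(n)]
    step = xs[1] - xs[0]
    amp = [
        step * sum(fx * cmath.exp(-2j * math.pi * k * x) for fx, x in zip(f, xs))
        for k in ks
    ]
    dx = rms_width(xs, [fx ** 2 for fx in f])
    dk = rms_width(ks, [abs(a) ** 2 for a in amp])
    return dx, dk

for sigma in (1.0, 0.5):
    dx, dk = packet_widths(sigma)
    print(f"sigma = {sigma}: dx = {dx:.4f}, dk = {dk:.4f}, 4*pi*dx*dk = {4 * math.pi * dx * dk:.4f}")
```

For this packet the product Δx Δk comes out very close to 1/4π, so that Δx Δp_x = h Δx Δk is close to h/4π; halving σ halves Δx and doubles Δk.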

In many cases it is not necessary to state precisely the value of 
the right-hand member of (94) and (95), it being sufficient to keep 
in mind that they are of the order of magnitude of h, and to assume 
for Δx . . . and Δp_x . . . any measure of the order of magnitude 
of the uncertainty. The conceptual scope of the principle remains 
the same. For this reason the uncertainty principle is often 
expressed by the formulas 

$$ \Delta x\,\Delta p_x \gtrsim h, \quad \Delta y\,\Delta p_y \gtrsim h, \quad \Delta z\,\Delta p_z \gtrsim h, \tag{94'} $$

$$ \Delta w\,\Delta t \gtrsim h, \tag{95'} $$

where the symbol ≳ indicates that we are dealing with orders of 
magnitude. 

In order to illustrate the principle better and to show the nature of the 
uncertainty which we are discussing, let us consider a one-dimensional case, 
limiting ourselves to an estimate of the orders of magnitude. Suppose 
that we have a group of light waves of total length 2l, like the group represented 
in Fig. 19. In such a case, the determination of the momentum p 
reduces to a measurement of its absolute value p = h/λ, that is, to a 
measurement of the wavelength. For this measurement we imagine the 
light to fall upon a grating. The grating will make it possible to measure λ 
with an uncertainty δλ whose order of magnitude is given (as is known 
from the theory of the resolving power of gratings) by 

$$ \frac{\delta\lambda}{\lambda} \cong \frac{\lambda}{L}, $$

where L is the maximum difference in optical paths used by the grating. 
Evidently, L cannot exceed the length 2l of the wave group, and hence 

$$ \frac{\delta\lambda}{\lambda} \gtrsim \frac{\lambda}{2l}; $$

and since we have (ignoring the sign) δp/p = δλ/λ, the preceding expression 
yields 

$$ \frac{\delta p}{p} \gtrsim \frac{\lambda}{2l}, \quad \text{or else} \quad 2l\,\delta p \gtrsim h. $$

This relation between the orders of magnitude of 2l and of δp, which 
could be made more precise by more detailed considerations, simply shows 
that p will come out the more uncertain, the shorter the wave group, that 
is, the more exactly the position of the photons is determined at any given 
instant. Of course the argument given with reference to the grating can 
be repeated for any other means of spectral analysis, because it is based 
upon the purely analytical fact that the wave group is equivalent in all 
its effects (as shown in §12) to the superposition of wave trains of different 
frequencies. 

The uncertainty principle expressed by (94) and (95) may be 
said to represent the corrections which must be applied to the 
corpuscular model of photons in order to take into account its lack 
of agreement with reality. Similarly, it can be shown that when 
the model of electromagnetic waves is used, a certain uncertainty 
relation must be introduced in the measurement of the fields E 
and H (see Nos. 10 and 10a of the Bibliography) upon which 
depends, for instance, the paradox of the photoelectric effect men¬ 
tioned in §8 of Part I. 

22. The uncertainty principle for material particles. The conceptual 
foundation of quantum mechanics lies in the fact (shown 
for the first time by Heisenberg) that the uncertainty principle, 
illustrated in the previous section, holds not only for photons but 
also for any "material particle," such as electrons, for instance.* 
This is equivalent to saying that it is conceptually impossible to 
determine the position and momentum of a particle at a given instant, 
with a precision exceeding that allowed by the inequalities (94); just 
as it is conceptually impossible to determine the energy of a particle, 
by means of an experiment lasting a time Δt, with an uncertainty 
below Δw, related to Δt by (95). 

Of course, this principle does not mean that it is impossible to 

* From this statement it follows that it is improper to use the term "particle" 
to designate those entities which have (just like photons) only some of the 
properties which correspond to the ordinary concept of a material particle or 
corpuscle. However, for convenience we shall currently use this term to 
collectively designate electrons, protons, neutrons, and nuclei, pointing out 
once and for all that the term does not have its ordinary meaning. 


know the position of the particle to any degree of desired accuracy 
(just as for photons), or to know its momentum (and hence its 
velocity) to any wanted precision; but it asserts that these data 
may not both be known at the same time. More exactly, the 
principle prohibits the simultaneous knowledge, with infinite pre¬ 
cision, of a coordinate and its corresponding component of momen¬ 
tum (or velocity). This uncertainty arises from the inadequacy of 
the particle model (just as in the case of photons), and marks the 
limits of validity of that model. This principle, although valid for 
any body, in practice has important consequences only in connec¬ 
tion with particles of atomic or subatomic dimensions, because with 
ordinary bodies, in view of the smallness of the constant h which 
occurs in the right-hand member of equations (94) and (95), the 
uncertainties Δx, Δp_x, and so on, required by the uncertainty principle 
are negligible compared with those much more considerable 
uncertainties caused by accidental errors of measurement or the 
imperfect definition of the reference points. 
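
To make the orders of magnitude concrete, the relation Δx Δp_x ≳ h of (94') may be evaluated for an electron and for a small macroscopic body. The following Python sketch is an illustration only: the two cases (an electron localized within 10⁻¹⁰ m, a grain of 1 mg localized within 10⁻⁶ m) and the function name `velocity_uncertainty` are assumptions chosen for the example.

```python
H = 6.62607015e-34           # Planck constant in J s
ELECTRON_MASS = 9.1093837e-31  # kg

def velocity_uncertainty(mass_kg, dx_m):
    """Order-of-magnitude velocity uncertainty h / (m dx), from dx * dp ~ h."""
    return H / (mass_kg * dx_m)

# Hypothetical cases: (mass in kg, position uncertainty in m).
cases = {
    "electron within 1e-10 m": (ELECTRON_MASS, 1e-10),
    "1 mg grain within 1e-6 m": (1e-6, 1e-6),
}

results = {name: velocity_uncertainty(m, dx) for name, (m, dx) in cases.items()}
for name, dv in results.items():
    print(f"{name}: dv of the order of {dv:.2e} m/s")
```

For the electron the velocity is uncertain by several thousand kilometers per second; for the grain the uncertainty is of the order of 10⁻²² m/s, far below any error of measurement.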

The fact that the uncertainty principle must also hold for 
material particles may be logically deduced from the validity of an 
analogous principle for photons and from the laws of conservation 
of energy and momentum. These laws, as has been ascertained 
experimentally, are verified in all phenomena of interaction between 
material particles and photons. In fact, if the uncertainty principle 
did not hold for material particles, we could profit by one of these 
phenomena of interaction to violate the uncertainty principle for 
photons. It would be sufficient to determine the coordinates of 
the particle during the interaction with the photon, and its momen¬ 
tum before and after such an interaction, with an accuracy exceeding 
that corresponding to (94), in order to know, with equal precision, 
the coordinates of the photon and the momentum imparted to it. 
For example, in the one-dimensional case mentioned at the end of 
the preceding section, it would be enough to measure p, not with a 
grating but by causing the photon to be absorbed by an atom 
(whose x has also been determined) and by measuring the momentum 
acquired by the latter. Similarly, given the duration Δt of the 
interaction, if it were possible to measure the energy of the particle 
in a time small compared with Δt and with an uncertainty below 
h/4πΔt, we could, by performing this measurement immediately 
before and after the interaction, determine the energy given up to 
or taken from the photon with a precision exceeding that allowed 
by (95). 

But to understand the nature of the uncertainty principle, we 
must examine the mechanism by which it is realized in nature, 
which is as follows. When we attempt to determine the coordinate 
x of a material particle with an accuracy Δx, we are forced 
to impart to it a momentum p_x whose order of magnitude is h/Δx 
but whose precise value it is conceptually impossible to know. 
Thus if a previous measurement had told us the value of p_x, such 
knowledge will be partially lost on account of the phenomena taking 
place after the measurement of x; similarly, a measurement of the 
momentum component p_x, with an accuracy Δp_x, implies that we 
displace the particle along the x-axis by a conceptually undetermined 
amount, of order of magnitude h/Δp_x, so that, if x was known 
before this measurement, this knowledge will become partially 
useless after that instant. Therefore we can say that in order to 
measure a coordinate and its corresponding momentum (or velocity 
component), mutually exclusive processes have to be employed. 
For this reason the two concepts of position and velocity, although 
they have precise physical meanings separately, may not be attributed 
simultaneously to the same particle in a precise manner. 
Bohr has used the term complementarity* to designate this particular 
logical relationship between the two concepts. 

We shall now show by means of some examples the physical 
incompatibility of the two types of measurement, referring the 
reader for a more exhaustive discussion to Heisenberg's book (see 
Nos. 10 and 10a of the Bibliography). 

23. Measurement of the coordinates of a particle. First method. 
In order to determine the position of a particle, we can ideally 
follow the same procedure ordinarily used to examine bodies, 
namely, by illuminating the particle and collecting the scattered 
radiation in an optical instrument (microscope or photographic 
camera). All that hinders the practical performance of this experi¬ 
ment is the very low intensity of the scattered light, a practical 
difficulty that in no way negates the validity of the following 
argument. 

Let us suppose that light (or, more generally, radiation) traveling 

* See No. 21 of the Bibliography. We shall see later on that this relation 
can also be extended to other pairs of physical quantities. 
along the x-axis (Fig. 22) strikes the particle whose position we 
wish to determine. We shall then put up, along the z-axis, an 
optical instrument, for simplicity assumed to be a pinhole camera 
without objective, equipped with a hole of center F and radius R. 
(The argument may also be made by substituting a lens camera or a 
microscope.) The arrangement will lend itself to an approximate 
determination of the two coordinates x and y of the particle at a 
certain instant. In fact, if the particle remained fixed, as at P, 
and if the experiment took a sufficiently long time, a diffraction 
pattern would form on the plate L, 
which we may represent schematically by a small uniform disk of 
center P' and radius 

$$ r' \cong \frac{b\lambda}{R}, $$

where b is the distance from the hole to the plate and X is the wave¬ 
length of the scattered radiation. But we know, from the experi¬ 
ments on the Compton effect, that the particle scatters light in 
quanta and that it receives, in each scattering process, a momentum 
which alters its velocity. Hence we must limit our experiment to 
a single scattering process, which will then not yield the whole 
diffraction pattern on the plate but only a single point Q' of the 
pattern. The position of the point P', which is where the center 
of the diffraction pattern would be formed if the particles continued 
to scatter quanta always from the same position, remains undeter¬ 
mined. All we can say about it is that its distance from Q' will 
not exceed r'. Hence, if the line Q'F intersects the xy-plane in a 
point Q, we can say that during the scattering process the particle P 
was located within a circle of center Q and radius 

$$ r = \frac{a}{b}\, r' \cong \frac{a\lambda}{R}, $$

where a is the distance of F from the xy-plane. Thus the coordinates 
x and y are determined with an uncertainty Δx, Δy of the 
order of magnitude r: 

$$ \Delta x \cong \Delta y \cong \frac{a\lambda}{R}. \tag{96} $$

(It may be seen that in order to obtain a fairly precise determination 
of x and y, it is convenient, other conditions being equal, to select 
radiations of fairly small wavelength, gamma rays for example, 
rather than light. This observation also corresponds to the well- 
known fact that the resolving power of optical instruments is 
inversely proportional to the wavelength of the light used.) 

We shall now be concerned with the velocity, or the momentum, 
with which the particle is left after the experiment. Suppose that 
initially the particle had a momentum with components p_x^0, p_y^0, p_z^0, 
which we shall assume exactly known, in order to introduce the 
most favorable conditions; they may be zero. After the measurement 
of the coordinates x and y, the momentum p_x, p_y, p_z of the 
particle will have been changed by the momentum received from 
the quantum (which we know from the theory of the Compton 
effect, confirmed by experience). Now the quantum had, before 
the scattering, a momentum with components 

$$ \frac{h}{\lambda}, \quad 0, \quad 0. \tag{97} $$

After scattering, if we designate by α, β, and γ the angles formed 
with the coordinate axes by the direction in which the quantum is 
scattered, the momentum of the quantum will have components 

$$ \frac{h}{\lambda}\cos\alpha, \quad \frac{h}{\lambda}\cos\beta, \quad \frac{h}{\lambda}\cos\gamma. $$

Hence the momentum of the particle after scattering will be given by 

$$ \begin{aligned} p_x &= p_x^0 + \frac{h}{\lambda}(1 - \cos\alpha), \\ p_y &= p_y^0 - \frac{h}{\lambda}\cos\beta, \\ p_z &= p_z^0 - \frac{h}{\lambda}\cos\gamma. \end{aligned} \tag{98} $$

If α, β, and γ were known, p_x, p_y, and p_z would be determined. 
However, as follows from the considerations of §18, it must be 
observed that the line along which the quantum was scattered 
cannot be found in any way. We can only say that it will pass 
through the hole of the pinhole camera, and hence that γ < γ_0, 
calling γ_0 the angle which the radius R of the hole subtends at P, 
which is given by tan γ_0 = R/a; or else, since we are dealing with 
small angles, 

$$ \sin\gamma < \frac{R}{a}. \tag{99} $$

Then since, evidently, 

$$ \cos^2\alpha + \cos^2\beta = 1 - \cos^2\gamma = \sin^2\gamma, $$

we have 

$$ \cos^2\alpha + \cos^2\beta < \frac{R^2}{a^2}, $$

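which is the only limitation on α and β. All we may learn from 
it concerning cos α and cos β separately is that they both must 
be comprised between −R/a and +R/a. It follows from this conclusion 
and from (98) that p_x may vary within the limits 

$$ p_x^0 + \frac{h}{\lambda}\left(1 - \frac{R}{a}\right) \quad \text{and} \quad p_x^0 + \frac{h}{\lambda}\left(1 + \frac{R}{a}\right), $$

and p_y within the limits 

$$ p_y^0 - \frac{h}{\lambda}\frac{R}{a} \quad \text{and} \quad p_y^0 + \frac{h}{\lambda}\frac{R}{a}; $$

hence the uncertainty for both is 

$$ \Delta p_x \cong \Delta p_y \cong \frac{h}{\lambda}\frac{R}{a}. \tag{100} $$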

From these equations and from (96) we obtain, upon multipli¬ 
cation, 

$$ \Delta x\,\Delta p_x \cong h, \quad \Delta y\,\Delta p_y \cong h, $$

in agreement with (94'). With the camera set up differently, 
similar reasoning applies concerning the coordinate z and its corre¬ 
sponding momentum. 
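
The compensation between the two uncertainties may be seen numerically. The sketch below is an illustration under assumptions: it takes the order-of-magnitude forms Δx ≅ aλ/R for the position and Δp_x ≅ (h/λ)(R/a) for the momentum, ignores numerical factors of order unity, and uses hypothetical values of a, R, and the wavelengths; the function name `microscope_uncertainties` is ours.

```python
H = 6.62607015e-34  # Planck constant in J s

def microscope_uncertainties(wavelength, a, radius):
    """Order-of-magnitude position and momentum uncertainties (SI units)
    for a single quantum scattered into a pinhole camera."""
    dx = wavelength * a / radius          # diffraction limit on the position
    dp = (H / wavelength) * (radius / a)  # unknown part of the recoil momentum
    return dx, dp

# Hypothetical geometry: hole of radius 1 cm at 10 cm from the xy-plane.
a, radius = 0.10, 0.01
outcomes = {}
for label, wavelength in [("visible light", 5e-7), ("gamma rays", 1e-12)]:
    dx, dp = microscope_uncertainties(wavelength, a, radius)
    outcomes[label] = (dx, dp)
    print(f"{label}: dx = {dx:.2e} m, dp = {dp:.2e} kg m/s, dx*dp/h = {dx * dp / H:.3f}")
```

Passing from visible light to gamma rays improves the localization by several orders of magnitude and spoils the momentum by the same factor, so that the product remains of the order of h whatever the wavelength.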

The same result would be found if, in the case of luminous 
particles or radioactive particles, we made use of the radiation 
emitted by the particles rather than of scattered radiation. In 
fact, the emission of a quantum is accompanied by a recoil which 
imparts to the particle a momentum in a direction opposite to that 
of the emitted quantum. 

Second method. If a parallel beam of particles (cathode rays, 
for example) is projected normally against a screen AB (Fig. 23) 
having a slit of width d, each time a particle crosses the slit 
we may say that its coordinate y has been determined (assuming 
the y-axis to lie at right angles to the slit) with an uncertainty 

$$ \Delta y \cong d. \tag{101} $$

From experiment (see Chapter 4 of Part I), we may now 
recall that a beam of material particles of momentum p undergoes 
diffraction phenomena corresponding to a wavelength λ = h/p. 
Therefore beyond the slit the beam will no longer be parallel but will 
have (in the direction y) an angular width 2α_0 given, according to 
the elementary laws of diffraction, by 

$$ \sin\alpha_0 \cong \frac{\lambda}{d}. $$

That is, the particle, upon crossing the slit, could be deviated 
through an angle α from its original direction, which may go from 
−α_0 to +α_0. Therefore the component p_y = p sin α of its momentum, 
which was originally zero, will remain undetermined within the 
limits ±p sin α_0, with an uncertainty 

$$ \Delta p_y \cong p\sin\alpha_0 \cong \frac{h}{d}. \tag{102} $$

From this and equation (101), we now get, in agreement with (94'), 

$$ \Delta y\,\Delta p_y \cong h, $$

and we may reason similarly for x and z. 

Third method. We know that material particles of large energy, 
such as alpha particles, may become individually visible, either by 
means of the scintillations they produce upon hitting a fluorescent 
screen (spinthariscope), or by being caused to traverse a gas supersaturated 
with water vapor which condenses, in the form of fog, 
upon the ions produced by the particle along its path (Wilson cloud 
chamber). In both these cases it is not actually the particle in 
question that is localized, but rather the atom which is hit, and is 
ionized or excited as a consequence of the collision. For this 
reason the coordinates of the particle at the instant of collision 
remain undetermined with an uncertainty of the order of the linear 
dimensions of the atom, which we shall indicate by 

$$ \Delta x, \quad \Delta y, \quad \Delta z. \tag{103} $$

We shall now see what can be said concerning the momentum 
which the atom retains after the collision. It is evidently equal to 
the initial momentum (which we assume to be known) minus the 
momentum imparted to the electron which is expelled or excited. 
Thus, in order to determine the latter momentum, we must take 
the vector difference between the momenta which the atomic 
electron has before and after the collision. The one after may be 
measured to any desired accuracy by either of the methods of the 
following section. However, concerning the momentum which the 
electron had within the atom before being struck, we can only 
say that its components lie within certain limits between which 
they oscillate periodically due to the orbital motion. The ampli¬ 
tude of these oscillations may easily be evaluated in the case of the 
circular Sommerfeld orbits: it is evidently (choosing the x- and 
y-axes in the plane of the orbit) given by 

$$ \Delta p_x = \Delta p_y \cong mv; $$

and since, by the quantization rule (see §10 of Part I), 

$$ mvr = \frac{nh}{2\pi}, $$

we have 

$$ \Delta p_x = \Delta p_y \cong \frac{nh}{2\pi r}. $$

On the other hand, the uncertainty in x and y is given by the 
dimensions of the orbit in this case, namely, 

$$ \Delta x \cong \Delta y \cong r, $$

so that 

$$ \Delta x\,\Delta p_x \cong \Delta y\,\Delta p_y \cong \frac{nh}{2\pi}. $$

In the most general case of any orbit, we should find a result 
of the same order of magnitude; that is, in general, 

$$ \Delta x\,\Delta p_x \cong nh, \quad \Delta y\,\Delta p_y \cong nh. $$

Since n in the most favorable case has the value 1, we again find 
the relations (94'). 

24. Measurement of the momentum or the velocity of a particle. 

First method. The most natural procedure of determining the 
velocity of a particle consists in determining its position at two 
instants t_1 and t_2, separated by a known time interval. However, 
we know that every determination of x, y, z introduces an uncertainty 
in p_x, p_y, p_z, which is given by (94) under the most favorable 
conditions. Therefore the velocity of the particle after the instant 
t_2 is not equal to that in the interval between t_1 and t_2, as a result 
of our measurement. The introduced change remains undetermined, 
and we know only that its components are of the order of 
magnitude given by 

$$ \Delta v_x \cong \frac{h}{m\,\Delta x_2}, \quad \Delta v_y \cong \frac{h}{m\,\Delta y_2}, \quad \Delta v_z \cong \frac{h}{m\,\Delta z_2}, \tag{105} $$

calling Δx_2, Δy_2, Δz_2 the uncertainties which we introduce in the 
measurement of the coordinates, performed at time t_2. Now, 
taking t_2 − t_1 sufficiently large, we can make the influence of these 
errors upon the measurement of the ratios (x_2 − x_1)/(t_2 − t_1), and 
so on, as small as we wish. Hence these ratios may be considered 
to be exactly determinable, but the quantity of interest, namely, 
the velocity after t_2, remains affected by an uncertainty expressed 
by (105), which may be made small only if we are satisfied with 
little accuracy in the measurement of the position at time t_2. 
Relations (105), which express this relationship between the two 
indeterminacies, are just the uncertainty relations (94'). 

We observe that the velocity between the two instants h and <2 
(which may be calculated, as we have said, to any desired accuracy) 
is a quantity devoid of physical interest, because its very definition 
presupposes that in the interval considered the particle does not 
interact with its surroundings. 
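To give a feeling for the magnitudes involved in (105), the following Python sketch evaluates h/(m Δx) for an electron located with various accuracies. The numerical constants and the choice of an electron are assumptions made here for illustration only.

```python
# Order-of-magnitude estimate from relation (105): dv ~ h / (m * dx2).
# The constants and the example particle are assumptions, not from the text.

H = 6.62607015e-34           # Planck constant, J s
M_ELECTRON = 9.1093837e-31   # electron mass, kg


def velocity_uncertainty(mass_kg: float, dx2_m: float) -> float:
    """Order of magnitude of the velocity change after a position
    measurement of accuracy dx2_m, according to (105)."""
    if mass_kg <= 0.0 or dx2_m <= 0.0:
        raise ValueError("mass and position uncertainty must be positive")
    return H / (mass_kg * dx2_m)


if __name__ == "__main__":
    for dx2 in (1e-10, 1e-8, 1e-6):
        dv = velocity_uncertainty(M_ELECTRON, dx2)
        print(f"dx2 = {dx2:.0e} m  ->  dv of the order of {dv:.2e} m/s")
```

A position fix of atomic size (about 1e-10 m) thus leaves the subsequent velocity of the electron undetermined to within several thousand kilometers per second, whereas a coarse fix disturbs it far less.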

Second method. In order to measure the velocity of a particle
without recourse to two successive observations of position, we
may make use of the Doppler effect. For example, to measure the
component $v_x$ we may employ the following scheme.

We send light of a known frequency $\nu$ in the direction of the
x-axis. We then receive the radiation scattered by the particle in a
spectroscope in the negative x-direction and determine its
frequency $\nu'$. We then obtain, from an elementary argument of optics,

$$\nu' = \nu\left(1 - \frac{2v_x^0}{c}\right), \tag{106}$$

where $v_x^0$ represents the velocity along the x-axis before the
measurement. However, it is to be kept in mind that the particle receives
a momentum $2h\nu/c$ in the scattering process, and hence that the
velocity after the measurement (which is the one we are interested
in) is

$$v_x = v_x^0 + \frac{2h\nu}{mc}.$$

Since $\nu$ is supposed to be known, $v_x$ will be determined with the
same accuracy as $v_x^0$ from (106). This accuracy depends on the
precision with which $\nu'$ is measured; that is, we have

$$\Delta v_x = \Delta v_x^0 = \frac{c}{2\nu}\,\Delta\nu'.$$

$\Delta\nu'$ cannot lie below $1/\Delta t$, as we have seen in §15, where $\Delta t$ is
the duration of illumination.* Hence under the most favorable
conditions we have

$$\Delta v_x = \frac{c}{2\nu\,\Delta t}. \tag{107}$$

On the other hand, we cannot say at what instant of the interval
$\Delta t$ the particle has received the momentum which has altered $v_x^0$
to $v_x$. Therefore, after the measurement there remains with the
coordinate x of the particle an uncertainty equal to

$$\Delta x = (v_x - v_x^0)\,\Delta t = \frac{2h\nu}{mc}\,\Delta t.$$

From this and (107), we obtain

$$\Delta x\,\Delta v_x = \frac{h}{m},$$

or else

$$\Delta x\,\Delta p_x = h,$$

which is the first of relations (94').

* Or better, $\Delta t$ is the duration of a coherent wave group. If the illumination
lasted longer but suffered sudden phase changes every now and then, $\Delta t$ would
stand for the interval between two such changes.
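The cancellation of the frequency and of the duration of illumination in the product Δx Δp_x can be checked numerically. The sketch below evaluates (107) and the recoil displacement for several arbitrary choices of ν and Δt; the numerical values and the function name are assumptions made for illustration.

```python
# Numerical check that the Doppler scheme always gives dx * dp_x = h,
# whatever the frequency and the duration of illumination.
# All numerical inputs below are illustrative assumptions.

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s


def doppler_uncertainties(nu_hz: float, dt_s: float, mass_kg: float):
    """Return (dx, dv) for light of frequency nu_hz lasting dt_s."""
    dv = C / (2.0 * nu_hz * dt_s)                    # relation (107)
    dx = (2.0 * H * nu_hz / (mass_kg * C)) * dt_s    # recoil velocity times dt
    return dx, dv


if __name__ == "__main__":
    mass = 9.1093837e-31  # electron mass, kg
    for nu, dt in ((5e14, 1e-12), (5e14, 1e-9), (3e18, 1e-15)):
        dx, dv = doppler_uncertainties(nu, dt, mass)
        print(f"nu = {nu:.1e} Hz, dt = {dt:.0e} s: dx*dp/h = {mass * dx * dv / H:.6f}")
```

A longer illumination sharpens the velocity and blurs the position in exactly compensating proportions, so that the product does not depend on the experimental arrangement.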

25. Probabilistic statement of the mechanics of a particle. The
considerations of the preceding sections show that for material
particles as well as for photons, the literally corpuscular
representation, which is suggested by the name, does not correspond exactly
to reality but merely represents a model whose limit of validity is,
so to speak, defined by the uncertainty principle. Therefore we
must abandon the intuitive concept of continuous motion along a
certain trajectory; instead we must direct our attention toward the
spatial distribution of the probability density. The latter will be
defined, just as for photons, by a function P(x, y, z, t), such that
P dx dy dz represents the probability* that upon performing an
observation at the time t, we find the particle in the element of
volume defined by (x, y, z; x + dx, y + dy, z + dz). In analogous
fashion, we shall introduce a probability density $Q(v_x, v_y, v_z, t)$ for
the components of the velocity of the particle at time t. Hence the
mechanics of material particles should not serve to determine their
motion, that is, their position and velocity as a function of time
(which would be contrary to the Heisenberg principle, since it would
be equivalent to postulating the existence of phenomena capable of
exactly defining the position and velocity of a particle at a given
instant). Instead the mechanics of material particles should play a
role similar to that which optics plays in relation to photons,
namely, to determine the probability densities P(x, y, z, t) and
$Q(v_x, v_y, v_z, t)$. If we then keep in mind that the diffraction
phenomena, which constitute some of the most obvious proofs of the
wave character of the laws of optics, are also verified for material
particles (as we have seen in Chapter 4 of Part I), we are naturally
led to inquire whether in the case of material particles the
probability densities might not also be obtained from equations of the
type used for waves.† This idea may be more exactly stated as
follows.‡

* In order to describe precisely the significance which must be attributed
to the term “probability” in wave mechanics, we should imagine a large
number N of independent identical systems, subject to the same initial
conditions, upon each of which we perform an observation of the particle
at time t. If in N' systems the measurement yields the considered result, we
say that its “probability” is N'/N. (In the optical case, we may refer to the
simultaneous presence of N photons; this is not possible here, because N
particles would interact and alter their respective probability distributions.)
Similarly, when we speak of the average value of a quantity, we understand that
this value is to be measured in the N above-mentioned systems and that the
average of the results is to be taken.

† On the limits of the analogy between optics and mechanics, see W. Pauli,
Zeits. f. Physik 80, 573 ff (1933).

‡ It goes without saying that the heuristic approach used here does not
reproduce the historical development of the theory (for which we refer to
what was said in Part I), nor does it pretend to give a rigorous justification for
the latter.

In the case of optics, the energy density W (and similarly the
intensity I of illumination of a surface) does not obey a simple
equation but is obtained from the formula

$$W = \frac{n^2}{8\pi}\left(E_x^2 + E_y^2 + E_z^2\right) + \frac{1}{8\pi}\left(H_x^2 + H_y^2 + H_z^2\right),$$

in terms of the components of the electric field E and the magnetic
field H, each of which satisfies the wave equation, which for $E_x$ is

$$\Delta E_x = \frac{n^2}{c^2}\frac{\partial^2 E_x}{\partial t^2}, \tag{108}$$

where n = the index of refraction. Similarly,* in the case of
mechanics, rather than to look for a differential equation which P
satisfies, it is convenient to introduce a certain number N
(undetermined for the present) of scalar, generally complex quantities
$\psi_1, \psi_2, \ldots$ from which P may be obtained by the formula

$$P = |\psi_1|^2 + |\psi_2|^2 + \cdots + |\psi_N|^2,$$

each of which satisfies a differential equation of the second order,
of the wave type. Dirac has shown that N = 4 is the minimum
number of functions $\psi$ with which a wave mechanics in agreement
with experimental facts and satisfying the principle of relativity
can be constructed. If, however, we are satisfied here with a wave
mechanics that is valid to the approximation to which ordinary
nonrelativistic mechanics holds (that is, for velocities small compared
to c), two functions $\psi$ will suffice (Pauli theory; see Chapter 14).
If the spin effects are also negligible (see §25 of Part I), a single
function $\psi$ will be sufficient; this is the wave mechanics of Schrödinger,
with which we shall be concerned until Chapter 14, where the
relativistic wave mechanics of Dirac with four components $\psi$ will
be developed. We shall further limit ourselves here to the study of
a single particle.

* In order to make the analogy more evident formally, it is convenient to
represent the electromagnetic field by a single complex vector $\Psi$, defined by

$$\Psi = \frac{1}{\sqrt{8\pi}}\,(n\mathbf{E} + i\mathbf{H}).$$

We can then see immediately that the expression for W becomes

$$W = |\Psi_x|^2 + |\Psi_y|^2 + |\Psi_z|^2 = |\Psi|^2.$$

The vector $\Psi$ (at points free of charges and currents) satisfies the equations

$$\operatorname{curl}\Psi = \frac{in}{c}\frac{\partial\Psi}{\partial t}; \qquad \operatorname{div}\Psi = 0,$$

which condense the equations of Maxwell and Laplace for E and H. From
these we find immediately that each complex component of $\Psi$ satisfies the
wave equation

$$\Delta\Psi_x = \frac{n^2}{c^2}\frac{\partial^2\Psi_x}{\partial t^2},$$

and so on.

We shall thus introduce a complex function $\psi(x, y, z, t)$ which
Schrödinger called “field scalar” and which is now more generally
called probability amplitude, in analogy with the amplitude of light
waves. The latter has no immediate physical significance but
serves to determine the probability density P (susceptible of
experimental check, as we have seen) by means of the relation

$$P = |\psi|^2. \tag{109}$$

We shall then assume that $\psi$ satisfies a differential equation
analogous to (108), that is,*

$$\Delta\psi = N^2\,\frac{\partial^2\psi}{\partial t^2}, \tag{108'}$$

where N, which represents the reciprocal of the phase velocity and
is analogous to n/c of (108), will be determined, by a method
explained later, as a function of x, y, z. $\psi$ will thus propagate as
waves (de Broglie waves), which, however, have no physical
existence but merely provide an analytical means of calculating the
probability density P.

* This equation holds rigorously only for “monochromatic” waves, that is,
waves of a single frequency. However, the principle of superposition enables
us to pass quickly to the more general case, as will be seen more clearly in §29.

Finally, we point out that the integral of P extended over all
space expresses the total probability that the particle is to be found
anywhere and hence must equal unity. Therefore $\psi$ must
satisfy the condition

$$\iiint |\psi|^2\,dx\,dy\,dz = 1. \tag{110}$$
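Condition (110) fixes the modulus of the multiplicative constant in front of ψ. The following Python sketch illustrates this for a one-dimensional analogue of (110), sampling an unnormalized wave packet on a grid and rescaling it. The Gaussian form of the packet, the grid, and the function names are assumptions made for the example.

```python
import math


def trapezoid_norm(psi, dx):
    """Integral of |psi|^2 over the grid by the trapezoid rule."""
    total = sum(abs(p) ** 2 for p in psi)
    total -= 0.5 * (abs(psi[0]) ** 2 + abs(psi[-1]) ** 2)
    return total * dx


def normalize(psi, dx):
    """Rescale the sampled psi so that the integral of |psi|^2 equals 1."""
    norm = trapezoid_norm(psi, dx)
    if norm <= 0.0:
        raise ValueError("psi vanishes everywhere on the grid")
    scale = 1.0 / math.sqrt(norm)
    return [scale * p for p in psi]


# Example: an unnormalized Gaussian packet carrying a phase factor exp(3ix).
dx = 0.01
xs = [-10.0 + k * dx for k in range(2001)]
psi = [complex(math.cos(3.0 * x), math.sin(3.0 * x)) * math.exp(-0.5 * x * x)
       for x in xs]
psi_normalized = normalize(psi, dx)

if __name__ == "__main__":
    print("total probability:", trapezoid_norm(psi_normalized, dx))
    print("|psi(0)| after normalization:", abs(psi_normalized[1000]))
```

The phase factor has no effect on the normalization, since only the modulus of ψ enters (110).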

Since $\psi$ is generally complex, it may be written in the form
$\psi = \psi' + i\psi''$, where $\psi'$ and $\psi''$ are two real functions satisfying
(108') separately, since the coefficients of the latter are real. The
use of imaginary quantities serves only to simplify the writing,
permitting the two equations which $\psi'$ and $\psi''$ satisfy to be collected
into a single formula. It might be supposed that a wave mechanics
with a single real $\psi$ could be constructed; but this has been proved
impossible* if we want P always to come out ≥ 0, as it must to
satisfy (110), and if it is to be obtained in finite terms from $\psi$
without the intervention of the derivatives of $\psi$ with respect to t.

* Pauli, Zeits. f. Physik 80, 573 ff (1933).

It remains to be seen how this probabilistic interpretation of
mechanics is connected with the ordinary concepts of kinematics
and dynamics, which, as is well known, may in many cases be
applied successfully (at least within the limits of observational
errors) to electrons, atoms, and so on. These are the cases in which
the probability density P is different from zero only in a region so
restricted that it may be considered to be a point (or, when the
de Broglie waves constitute a small “packet,” analogous to the light
packet already considered several times). The particle may then
be thought of practically as being localized in the center of the
packet (or at any point of the packet), and its motion will be
identified with the motion of that point. Hence every time we
reason in terms of the classical kinematics of the motion of a
particle, it is to be understood that we speak of the motion of a
packet of de Broglie waves which is practically a point.
26. Motion of a particle and motion of a wave packet. In order
to give a definite form to the fundamental equation of wave
mechanics (108'), the coefficient N, which we may call the index of
refraction of space for the de Broglie waves, remains to be
determined. In general it will be a function not only of x, y, z but also
of the frequency $\nu$ of these waves (as in optics for dispersive media).
To determine N, we must take into account two experimental
facts:

(I) the fact that, in cases in which the de Broglie waves
constitute almost a point packet, the motion of the latter is governed by
the ordinary laws of point mechanics (see the end of the preceding
section);

(II) the relation $\lambda = h/p$ (given by diffraction experiments)
between the de Broglie wavelength and the momentum of the
particles (see §33 of Part I).

In order to account for condition (I), we shall consider a particle
of mass m moving in a field of potential* U(x, y, z), with total
energy E. In wave mechanics this particle (supposing that its
position is known with very great precision) will be represented by
a “de Broglie wave packet” which is so restricted in extension that
it may be considered to be a point. The movement of this packet
between any two points A and B of its trajectory must be identified
(ignoring diffraction), both in its trajectory and in its velocity,
with the motion which classical mechanics assigns to a material
point of mass m and energy E in a potential U between points A
and B. In order to obtain this identification, we have at our
disposal the coefficient N of (108') as well as the frequency $\nu$ of the
de Broglie waves (or better, the mean frequency of the wave trains
forming the packet).

* We designate as “potential” the potential energy of the particle, whereas
in rational mechanics “potential” is usually the same energy with its sign
reversed.

First we shall proceed to the identification of the two
trajectories. The trajectory of the wave packet is nothing but a “ray”
and is therefore determined by the laws of refraction, which may be
condensed into a variational principle. In the case of light waves
this principle takes the name of Fermat's principle and may be
stated as follows: The line s traversed by a light ray between any
two points A and B is such that, when it is deformed infinitesimally
(keeping A and B fixed), the time needed by the light to travel
from A to B does not change, at least to infinitesimals of higher
order.

Since the time required by light to cover an infinitesimal
distance ds is equal to (N/c) ds (N is the index of refraction, and hence
c/N is the velocity of light at the point in question), the principle
just stated may be expressed (using the well-known symbol $\delta$ from
the calculus of variations) by the formula

$$\delta\int_A^B N\,ds = 0. \tag{111}$$

On the other hand, the trajectory of a material point, according to
classical mechanics, is determined from the principle of least action,
which in the present case may be expressed as follows: The
trajectory s is such that when it is deformed by an infinitesimal amount
(keeping A and B and the energy E fixed) the integral

$$\int_A^B \sqrt{E - U}\,ds \tag{112}$$

(twice this integral is called “action”) does not vary, at least to
infinitesimals of higher order. This is equivalent to saying that

$$\delta\int_A^B \sqrt{E - U}\,ds = 0. \tag{113}$$

If now the trajectory of the wave packet between A and B is to
coincide with the trajectory which classical mechanics ascribes to
the point, it is necessary that (111) and (113) be equivalent, for
which it suffices to take

$$N = C\sqrt{E - U}, \tag{114}$$

where C is an arbitrary constant (independent of x, y, z). Thus we
have found the spatial distribution of the index of refraction (it was
to be expected that the latter is a function of x, y, z through U only).
We observe that, since in general the index of refraction of de
Broglie waves depends on their frequency, C and E must be
considered to be functions of $\nu$.

Let us now also identify the velocity of the two motions point
for point. The velocity v of the wave packet is not the phase
velocity 1/N but the group velocity (see §14), which is given by
(74') and may be written (noting that $1/\lambda = N\nu$) as

$$\frac{1}{v} = \frac{d(N\nu)}{d\nu}, \tag{115}$$

or else, introducing (114),

$$\frac{1}{v} = \frac{d}{d\nu}\left(C\nu\sqrt{E - U}\right) = \frac{d(C\nu)}{d\nu}\sqrt{E - U} + \frac{C\nu}{2\sqrt{E - U}}\frac{dE}{d\nu}. \tag{116}$$

On the other hand, the velocity of the point given by classical
mechanics is (since $\tfrac{1}{2}mv^2 = E - U$)

$$v = \sqrt{\frac{2}{m}}\,\sqrt{E - U}. \tag{117}$$

Equating the two expressions for 1/v, we obtain, after obvious
reductions,

$$\sqrt{2m} = 2\,\frac{d(C\nu)}{d\nu}\,(E - U) + C\nu\,\frac{dE}{d\nu}. \tag{118}$$

We observe that (118) must be an identity in x, y, z, and that
these variables enter only through U. The coefficient of U must
therefore be identically zero; that is, we must have

$$\frac{d(C\nu)}{d\nu} = 0, \qquad \text{or} \qquad C\nu = K, \tag{119}$$

K being an arbitrary constant. Then (118) becomes

$$K\,\frac{dE}{d\nu} = \sqrt{2m}, \qquad \text{or} \qquad \frac{dE}{d\nu} = a,$$

where we have put

$$a = \frac{\sqrt{2m}}{K}. \tag{120}$$

Upon integrating, we get for $E(\nu)$ the linear expression

$$E = a\nu + b, \tag{121}$$

where a and b are two constants. The expression for $C(\nu)$ is then
given by (119), which may be written, by making use of (120),

$$C = \frac{\sqrt{2m}}{a\nu}. \tag{122}$$

Precisely, the expression for the index of refraction which
ensures the complete identity of the two motions is

$$N = \frac{\sqrt{2m}}{a\nu}\,\sqrt{E - U}, \tag{123}$$

where the expression (121) may be substituted for E.

The phase velocity of the de Broglie waves is therefore

$$V = \frac{1}{N} = \frac{a\nu}{\sqrt{2m(E - U)}}, \tag{123'}$$

or else, upon introducing the momentum p = mv and considering
(117),

$$\lambda = \frac{V}{\nu} = \frac{a}{mv} = \frac{a}{p}. \tag{124}$$
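The identification of the group velocity with the classical velocity can be verified numerically. The Python sketch below takes the index of refraction (123) with E given by (121), differentiates Nν by finite differences as in (115), and compares the result with (117). The numerical values of m, a, b, U, and ν are arbitrary assumptions in scaled units, and the function names are introduced only for this example.

```python
import math


def refractive_index(nu, U, m=1.0, a=1.0, b=0.0):
    """N from (123), with E = a*nu + b from (121)."""
    E = a * nu + b
    return math.sqrt(2.0 * m * (E - U)) / (a * nu)


def group_velocity(nu, U, m=1.0, a=1.0, b=0.0, dnu=1e-6):
    """Group velocity from (115): 1/v = d(N nu)/d nu (central difference)."""
    def inverse_wavelength(x):
        return refractive_index(x, U, m, a, b) * x   # N * nu = 1 / lambda
    slope = (inverse_wavelength(nu + dnu) - inverse_wavelength(nu - dnu)) / (2.0 * dnu)
    return 1.0 / slope


def classical_velocity(nu, U, m=1.0, a=1.0, b=0.0):
    """Velocity of the material point from (117)."""
    E = a * nu + b
    return math.sqrt(2.0 * (E - U) / m)


if __name__ == "__main__":
    nu, U, b = 2.0, 0.5, 0.3
    print("group velocity    :", group_velocity(nu, U, b=b))
    print("classical velocity:", classical_velocity(nu, U, b=b))
    print("phase velocity    :", 1.0 / refractive_index(nu, U, b=b))
```

The group velocity agrees with the classical one for any choice of the constants a and b, whereas the phase velocity 1/N does not.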


27. The Schrödinger equation. The expression for N found in
the preceding section, when substituted into (108'), furnishes a
wave mechanics which satisfies the condition that it coincide with
ordinary mechanics within limits analogous to those of geometrical
optics, for any constants a and b. But condition (II) of the previous
section forces us to fix the value of the constant a. In fact,
equation (124) has very nearly the form of the relationship found
experimentally between $\lambda$ and p, but for it to agree numerically we must
take

$$a = h. \tag{125}$$

The constant b, however, remains undetermined, since it does
not occur in the expression for $\lambda$ but only in the expression for $\nu$,
and the frequency of the de Broglie waves is not accessible to
experiment. This indeterminacy corresponds to the fact that the energy E
is defined to within an arbitrary additive constant.* For
simplicity we shall take b = 0, so that (121) becomes

$$E = h\nu, \tag{121'}$$

that is, identical to the formula which holds for photons.

* This statement holds only in nonrelativistic mechanics. Taking
relativity into account, we have a way of fixing E in absolute magnitude also;
and analogously, in relativistic wave mechanics b is determined (see Chapter 14).

If we take (125) into account, the relation (123') between the
phase velocity V of the de Broglie waves of frequency $\nu$ and the
potential U becomes

$$N = \frac{1}{V} = \frac{\sqrt{2m(E - U)}}{h\nu}, \tag{126}$$

or also, from (121'),

$$N = \frac{1}{V} = \frac{\sqrt{2m(h\nu - U)}}{h\nu}. \tag{126'}$$

Therefore we can say that for de Broglie waves a force field
represents the analogue of a medium of nonuniform index of
refraction for light, and furthermore that for the de Broglie waves the
medium is always dispersive, since the index of refraction depends
on $\nu$.

If we now introduce the expression (126) for N into the general
equation (108') which $\psi$ satisfies, we find

$$\Delta\psi = \frac{2m(E - U)}{h^2\nu^2}\,\frac{\partial^2\psi}{\partial t^2}, \tag{127}$$

which expresses the law of propagation for $\psi$ and is the fundamental
equation of wave mechanics. However, (127) presupposes that the
waves are monochromatic (to use optical terminology), or that $\psi$ is a
sinusoidal function of time with frequency $\nu$, which means physically,
as (121') shows, that the energy of the particle has a definite value E
(stationary states or quantum states). In order to express this fact
analytically, it is convenient (according to the established use in
other branches of physics, such as in electricity) to make use of
exponential functions with imaginary exponent, and to put*

$$\psi = u(x, y, z)\,e^{-2\pi i\nu t}, \tag{128}$$

where u is a generally complex function independent of time, whose
modulus represents the amplitude of the oscillations of $\psi$, and where
$\nu = E/h$.

* Many authors write this formula with a plus sign in the exponent. This
makes no essential difference, except for a few changes of sign in the formulas
derived from it, as in the right-hand member of (136).
From (128) we obtain

$$\frac{\partial\psi}{\partial t} = -2\pi i\nu\,\psi, \tag{129}$$

$$\frac{\partial^2\psi}{\partial t^2} = -4\pi^2\nu^2\,\psi. \tag{130}$$

By taking these into account, (127) may be written in the following
form, which no longer contains derivatives with respect to time, and
is the one usually adopted (Schrödinger equation):

$$\Delta\psi + \frac{8\pi^2 m}{h^2}\,(E - U)\,\psi = 0; \tag{131}$$

or also, upon putting for $\psi$ expression (128),

$$\Delta u + \frac{8\pi^2 m}{h^2}\,(E - U)\,u = 0. \tag{131'}$$

This shows that the spatial part u of the function $\psi$ satisfies the
same equation as $\psi$. Since, on the other hand, $\psi$ determines the
probability distribution $|\psi|^2$, which is equal to $|u|^2$, in many cases it
makes no difference whether we use the function u instead of $\psi$
(provided, however, that we are dealing, as we are assuming here,
with waves of a single frequency).

The function conjugate to $\psi$, or

$$\psi^* = u^*(x, y, z)\,e^{2\pi i\nu t},$$

evidently satisfies the same equation (131) as $\psi$ and has the same
modulus, so that nothing new is learned by considering it.

It is also to be noted that we are dealing with homogeneous
equations in (127) or (131); hence, after one solution has been found,
an infinite number of others may be obtained from it by multiplying
by an arbitrary constant. However, this indeterminacy is
eliminated if we take into account the fact that $\psi$ must satisfy condition
(110), as has already been stated in §25. This equation may also
be written (setting dS = dx dy dz as usual) in the form

$$\int \psi\psi^*\,dS = 1. \tag{132}$$

This is called the normalization condition and has been studied in
Chapter 1 for the eigenfunctions of differential equations. As we
have seen in §4, it may always be satisfied (provided that the
integral is convergent, a condition which we shall study later on)
and determines (to within a factor of the form $e^{i\beta}$) the arbitrary
multiplicative constant. The arbitrariness which remains in the
argument of $\psi$ because of the arbitrary constant $\beta$ has no influence
either on the modulus or on the wavelength of $\psi$, which alone have
physical significance; hence it may be neglected.

28. Energy levels as eigenvalues of the Schrödinger equation.
In order to make the problem of the wave mechanics of a particle
determinate, we must impose some conditions of regularity on $\psi$,
in addition to the condition of satisfying (in the case of stationary
states) equation (131) with an appropriate value for the constant E
(which occurs in the equation as an a priori undetermined constant).
Specifically, it will first of all be required that $\psi$ and its first
derivatives be continuous and single-valued over all space. Furthermore,
in order for us to be able to apply the normalization condition (132),
the integral of $|\psi|^2$ extended over all space* must be convergent;
this condition will be assured if $\psi$ is infinitesimal to sufficiently high
order at infinity. If then the potential U(x, y, z) has points of
singularity, singularities for $\psi$ may also occur (see §16). In this
case we shall require that these singularities be at most poles of
order lower than the first.†

With these conditions, the problem of the integration of the
Schrödinger equation enters into the category of those studied in
the mathematical introduction. As has been pointed out, this
equation has solutions only if the parameter E (which corresponds
to $\lambda$ of Chapter 5) has one of the values which we have termed
eigenvalues of the differential equation. Thus the rather natural
conditions of regularity imposed upon $\psi$ lead automatically to the
existence of energy levels (discrete or sometimes continuous), which
is an experimental fact, as we have seen. The determination of
these levels reduces to the mathematical problem of looking for the
eigenvalues of the Schrödinger equation. Herein lies one of the
most remarkable results of wave mechanics, as has been pointed out
in Part I.

* In certain cases the conditions of the problem force the particle to stay
within a certain region S. Then, of course, we may integrate the equation
only within that region, with the condition that $\psi = 0$ outside (see §38, for
example).

† For a justification of these conditions as well as for a statement of the
conditions which $\psi$ must satisfy in more general problems than these, see
von Neumann, Götting. Nachr., 1 (1927); also Pauli, No. 14 of the Bibliography,
page 121.
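The way in which the boundary conditions single out discrete values of E may be illustrated by a small numerical experiment. The Python sketch below treats a one-dimensional version of (131') for a particle confined to the interval 0 < x < L with U = 0 inside, in units chosen so that 8π²m/h² = 1 and L = π; it integrates the equation from x = 0 and adjusts E by bisection until u vanishes at x = L (a "shooting" procedure). The confinement to a box, the units, the Runge-Kutta integrator, and all function names are assumptions introduced for this example and are not the method of the text.

```python
import math


def endpoint_value(E, length=math.pi, steps=2000):
    """Integrate u'' = -E u from x = 0 (u = 0, u' = 1) and return u(length).

    Units are chosen so that 8 pi^2 m / h^2 = 1, and U = 0 inside the box.
    The integration uses the classical fourth-order Runge-Kutta scheme.
    """
    h = length / steps
    u, du = 0.0, 1.0
    for _ in range(steps):
        k1u, k1d = du, -E * u
        k2u, k2d = du + 0.5 * h * k1d, -E * (u + 0.5 * h * k1u)
        k3u, k3d = du + 0.5 * h * k2d, -E * (u + 0.5 * h * k2u)
        k4u, k4d = du + h * k3d, -E * (u + h * k3u)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        du += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
    return u


def find_eigenvalue(E_low, E_high, tol=1e-10):
    """Bisection on E until the solution also vanishes at the far wall."""
    f_low = endpoint_value(E_low)
    f_high = endpoint_value(E_high)
    if f_low * f_high > 0.0:
        raise ValueError("the interval does not bracket an eigenvalue")
    while E_high - E_low > tol:
        E_mid = 0.5 * (E_low + E_high)
        f_mid = endpoint_value(E_mid)
        if f_low * f_mid <= 0.0:
            E_high = E_mid
        else:
            E_low, f_low = E_mid, f_mid
    return 0.5 * (E_low + E_high)


if __name__ == "__main__":
    for bracket in ((0.5, 1.5), (3.0, 5.0), (8.0, 10.0)):
        print(bracket, "->", find_eigenvalue(*bracket))
```

In these units the admissible values come out as the squares of the integers; for any other value of E the solution that vanishes at one wall fails to vanish at the other and is therefore rejected.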
28a. The eigenfunctions of the Schrödinger equation. To every
eigenvalue $E_n$ (where n may in general represent a group of several
indices) there will correspond one or more normalized eigenfunctions
$\psi_n$ which, naturally, depend upon the time according to the law

$$\psi_n = u_n\,e^{-\frac{2\pi i}{h}E_n t}. \tag{128'}$$

If the eigenvalue is simple, the eigenfunction corresponding to it
immediately gives, by the square of its modulus, the probability
distribution P. If the eigenvalue is multiple, this distribution will
not be completely determined by the value of the energy: this is the
case of degeneracy, of which we shall see important examples in what
is to follow.

We shall now prove an important property of the eigenfunctions
of the Schrödinger equation, namely: if $\psi_n$ and $\psi_k$ are two
eigenfunctions belonging to two distinct eigenvalues $E_n$ and $E_k$, they are
“orthogonal” to each other, which amounts to saying that

$$\int \psi_n\psi_k^*\,dS = 0, \tag{$\alpha$}$$

where the integral is understood to be extended over all space.
This theorem and terminology are the natural extension of what has
been said in §5 for the one-dimensional case.

In order to prove ($\alpha$), we shall first of all write that $\psi_n$ and $\psi_k$
satisfy the Schrödinger equation for $E = E_n$ and $E = E_k$, respectively:

$$\Delta\psi_n + \frac{8\pi^2 m}{h^2}\,(E_n - U)\,\psi_n = 0, \tag{$\beta$}$$

$$\Delta\psi_k + \frac{8\pi^2 m}{h^2}\,(E_k - U)\,\psi_k = 0.$$

The second equation, when i is changed into −i, may be written

$$\Delta\psi_k^* + \frac{8\pi^2 m}{h^2}\,(E_k - U)\,\psi_k^* = 0. \tag{$\gamma$}$$

Let us now multiply ($\beta$) by $\psi_k^*\,dS$ and ($\gamma$) by $\psi_n\,dS$, subtract one
from the other, and integrate over any volume S'. We obtain

$$\int_{S'}\left(\psi_k^*\,\Delta\psi_n - \psi_n\,\Delta\psi_k^*\right)dS + \frac{8\pi^2 m}{h^2}\,(E_n - E_k)\int_{S'}\psi_n\psi_k^*\,dS = 0.$$

Let us transform the first integral by means of Green's theorem
into an integral extended over the surface $\sigma$ bounding S' (whose
normal, directed into S', we shall call $\nu$). We then get

$$\int_\sigma\left(\psi_n\frac{\partial\psi_k^*}{\partial\nu} - \psi_k^*\frac{\partial\psi_n}{\partial\nu}\right)d\sigma + \frac{8\pi^2 m}{h^2}\,(E_n - E_k)\int_{S'}\psi_n\psi_k^*\,dS = 0.$$

Let us now make the surface $\sigma$ tend toward infinity. By virtue of
the hypotheses made in §28 concerning the behavior of the
eigenfunctions at infinity, the integral extended over $\sigma$ approaches zero,
whereas the integral extended over S' has for limit the integral
occurring in ($\alpha$). Hence there remains

$$(E_n - E_k)\int\psi_n\psi_k^*\,dS = 0,$$

and since $E_n \neq E_k$ has been assumed, relation ($\alpha$) is proved.
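The orthogonality relation can be checked numerically on a simple case. The Python sketch below uses the eigenfunctions of a particle confined to a one-dimensional box of length L, including their time factors (128'), and evaluates the overlap integrals by the midpoint rule. The box, the units (h/2π = 1 and m = 1/2, so that the energy of the n-th state is (nπ/L)²), and the function names are assumptions made for the example.

```python
import cmath
import math

L = 1.0  # box length (assumed for the example)


def psi(n, x, t):
    """n-th eigenfunction of the box, with its time factor as in (128').

    Units: h / (2 pi) = 1 and m = 1/2, so that E_n = (n pi / L)^2.
    """
    energy = (n * math.pi / L) ** 2
    spatial = math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)
    return spatial * cmath.exp(-1j * energy * t)


def overlap(n, k, t=0.3, points=2000):
    """Midpoint-rule estimate of the integral of psi_n * conj(psi_k)."""
    dx = L / points
    total = 0j
    for j in range(points):
        x = (j + 0.5) * dx
        total += psi(n, x, t) * psi(k, x, t).conjugate()
    return total * dx


if __name__ == "__main__":
    for n, k in ((1, 2), (2, 5), (3, 3)):
        print(f"n = {n}, k = {k}: |overlap| = {abs(overlap(n, k)):.3e}")
```

The overlap vanishes (to rounding error) for distinct eigenvalues at any time t, while for n = k it reduces to the normalization integral.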

29. The principle of superposition of wave mechanics. It is
natural to assume that a principle of superposition analogous to
the one in optics holds for de Broglie waves. Let us therefore
consider two different (normalized and independent) solutions $\psi_1$, $\psi_2$
of the Schrödinger equation, corresponding in general* to two
different values $E_1$ and $E_2$ of the energy and to two different
momenta $p_1$ and $p_2$. For simplicity we shall limit ourselves to the
case of only two components, but the extension to any number is
immediate. We now form a linear combination (with two constant
coefficients $c_1$ and $c_2$)

$$\psi = c_1\psi_1 + c_2\psi_2 = c_1 u_1\,e^{-\frac{2\pi i}{h}E_1 t} + c_2 u_2\,e^{-\frac{2\pi i}{h}E_2 t},$$

to which we shall apply the normalization condition. This becomes
$|c_1|^2 + |c_2|^2 = 1$, as may be verified immediately if it is remembered
that $\psi_1$ and $\psi_2$ are orthonormal. What will be the significance of
this $\psi$? It is evident that this case is analogous to the optical case
(§20) of a photon with undetermined energy and momentum, and
that therefore $|c_1\psi_1|^2\,dS$ represents the probability that the particle
will be found in the volume element dS with energy $E_1$ and
momentum $p_1$, that is, with the characteristics corresponding to the
$\psi_1$ waves, whereas $|c_2\psi_2|^2\,dS$ represents the probability of finding it in
dS with the characteristics of the $\psi_2$ waves. It is to be noted that
these two expressions represent the actual probability and not
quantities proportional to it, as may be immediately verified by
observing that the total probability of finding the particle anywhere
and with any energy turns out to be unity (as it should). In fact,

$$\int|c_1\psi_1|^2\,dS + \int|c_2\psi_2|^2\,dS = |c_1|^2 + |c_2|^2 = 1.$$

It is now evident that $|c_1|^2$ represents the probability that the
particle has an energy $E_1$ and a momentum $p_1$, and $|c_2|^2$ the
probability that it has energy $E_2$ and momentum $p_2$.

See §9 of No. 32a of the Bibliography.

* In cases of degeneracy the two energies $E_1$ and $E_2$ may be equal. In
that case too, however, we shall require $\psi_1$ and $\psi_2$ to be orthogonal (see §6).
This case corresponds, in optics, to the superposition of waves of the same
frequency but of different directions.

Let us now consider the most general $\psi(x, y, z, t)$ possible.
Once t has been fixed at a certain value, t = 0 for example, $\psi$ may
be expanded in a series by means of the eigenfunctions of the
Schrödinger equation (131'), which, as we know, form a complete
orthogonal set; hence

$$\psi(x, y, z, 0) = \sum_n c_n u_n(x, y, z),$$

with

$$c_n = \int\psi(x, y, z, 0)\,u_n^*\,dS.$$

Now each of the $u_n$ represents the distribution of $\psi$ for t = 0, in a
“stationary state”; this $\psi$ then evolves in time according to (128').
These various components, as we have said, always superimpose
themselves without affecting each other, and hence $\psi$ at any later
time t will be given by

$$\psi(x, y, z, t) = \sum_n c_n u_n(x, y, z)\,e^{-\frac{2\pi i}{h}E_n t}. \tag{133}$$

This $\psi$ represents a state* of the particle in which the energy
and momentum are not determined: the probability that the energy
has a value $E_n$ and the momentum a value corresponding to the
waves $\psi_n$ is $|c_n|^2$, because of what has been said above.

* The form of the function $\psi$ which is assigned to each case (or “state” of
the system) depends on the initial conditions (as we shall see better in Chapter
11) and in particular on the observations to which the system has been
subjected initially.

Equation (133) has been written for the case of discrete
eigenvalues. However, if the eigenvalues of (131') constitute a
continuous spectrum from $E = E_a$ to $E = E_b$, the series will be replaced
by an integral:

$$\psi = \int_{E_a}^{E_b} c(E)\,\psi_E\,dE, \tag{133'}$$

where $\psi_E = u_E(x, y, z)\,e^{-\frac{2\pi i}{h}Et}$, and $u_E$ satisfies the Schrödinger
equation

$$\Delta u_E + \frac{8\pi^2 m}{h^2}\,(E - U)\,u_E = 0.$$

In this case, $|c(E)|^2$ represents the “probability density” in the
continuous energy spectrum; that is, $|c(E)|^2\,dE$ is the probability
that the energy lies between E and E + dE.

The most general case is that in which there are both discrete
and continuous eigenvalues, where $\psi$ will be the sum of a series and
an integral. Nevertheless, we may consider (133') to be the
most general form of $\psi$ if the integral is understood to be defined in
the sense of Stieltjes (see footnote on page 104).
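The expansion in eigenfunctions and the interpretation of the squared coefficients as probabilities can be illustrated numerically. The Python sketch below expands an arbitrarily chosen normalized initial state of a particle in a one-dimensional box into the box eigenfunctions and sums the squared coefficients. The box, the parabolic initial state, the number of terms retained, and the function names are assumptions introduced for the example.

```python
import math

L = 1.0        # box length (assumed)
POINTS = 2000  # number of midpoint-rule sample points


def u(n, x):
    """n-th normalized eigenfunction of the box (real in this case)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def psi0(x):
    """Normalized initial state psi(x, 0), chosen arbitrarily for the example."""
    return math.sqrt(30.0 / L ** 5) * x * (L - x)


def coefficient(n):
    """c_n = integral of psi(x, 0) * u_n(x) dx, by the midpoint rule.

    No complex conjugation is needed here because u_n is real.
    """
    dx = L / POINTS
    return sum(psi0((j + 0.5) * dx) * u(n, (j + 0.5) * dx)
               for j in range(POINTS)) * dx


coeffs = [coefficient(n) for n in range(1, 21)]
probabilities = [c * c for c in coeffs]

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"|c_{n}|^2 = {probabilities[n - 1]:.6f}")
    print("sum over the first 20 terms:", sum(probabilities))
```

For this initial state nearly all of the probability falls on the lowest level, the even-numbered levels do not contribute by symmetry, and the sum of the squared coefficients approaches unity as more terms are retained.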

30. The time-dependent Schrödinger equation. It is to be kept
in mind that $\psi$ in the general form (133) or (133'), that is, the
“heterochromatic” form, does not satisfy the Schrödinger
equation (131), because each of its terms satisfies an equation of the
form (131) with a different value of E. However, we can
easily find an equation independent of E which is satisfied by all
monochromatic components, and therefore also by the $\psi$ which
results from their superposition.

We recall that the component $\psi_n$ is of the form

$$\psi_n = u_n(x, y, z)\,e^{-\frac{2\pi i}{h}E_n t},$$

and therefore

$$\frac{\partial\psi_n}{\partial t} = -\frac{2\pi i}{h}\,E_n\psi_n. \tag{134}$$

The component $\psi_n$ also satisfies the Schrödinger equation

$$\Delta\psi_n + \frac{8\pi^2 m}{h^2}\,(E_n - U)\,\psi_n = 0. \tag{135}$$

Upon eliminating $E_n$ between (134) and (135), we obtain the
equation

$$\Delta\psi_n - \frac{8\pi^2 m}{h^2}\,U\psi_n = -\frac{4\pi i m}{h}\,\frac{\partial\psi_n}{\partial t},$$

in which the coefficients are independent of the index n. Therefore
this equation is satisfied by all components $\psi_n$ and hence also by
any of their linear combinations.

We may now say that the probability amplitude $\psi$, even in the
case where the energy of the particle is not determined and hence
where the waves are not "monochromatic," will always satisfy
the equation

$$\Delta\psi - \frac{8\pi^2 m}{h^2}U\psi = -\frac{4\pi i m}{h}\frac{\partial\psi}{\partial t}, \tag{136}$$

which we shall call the time-dependent Schrödinger equation.
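The argument of this section can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a free particle ($U = 0$), units in which $h = m = 1$, and two arbitrarily chosen monochromatic components. It compares the residual of the time-dependent equation with that of the stationary equation (131) written for a single fixed $E$; the function names are hypothetical.

```python
import cmath
import math

H = 1.0  # Planck's constant h in arbitrary units (assumption of this sketch)
M = 1.0  # particle mass m in the same units (assumption of this sketch)


def psi(x, t, components):
    """Sum of free waves a * exp(2*pi*i*(k*x - E*t/h)) with E = (h*k)^2 / (2*m)."""
    total = 0j
    for a, k in components:
        energy = (H * k) ** 2 / (2.0 * M)
        total += a * cmath.exp(2j * math.pi * (k * x - energy * t / H))
    return total


def laplacian(x, t, components, d=1e-3):
    """Second x-derivative of psi by central differences."""
    return (psi(x + d, t, components) - 2.0 * psi(x, t, components)
            + psi(x - d, t, components)) / d ** 2


def time_dependent_residual(x, t, components, d=1e-3):
    """|Laplacian(psi) + (4*pi*i*m/h) d(psi)/dt|, which must vanish when U = 0."""
    dpsi_dt = (psi(x, t + d, components) - psi(x, t - d, components)) / (2.0 * d)
    return abs(laplacian(x, t, components, d) + 4j * math.pi * M / H * dpsi_dt)


def stationary_residual(x, t, components, energy, d=1e-3):
    """|Laplacian(psi) + (8*pi^2*m/h^2) E psi|: the stationary form with one fixed E."""
    return abs(laplacian(x, t, components, d)
               + 8.0 * math.pi ** 2 * M / H ** 2 * energy * psi(x, t, components))


two_waves = [(0.6, 0.5), (0.8, 1.0)]  # (amplitude, wave number k), chosen arbitrarily
print(time_dependent_residual(0.3, 0.2, two_waves))             # small: only discretization error
print(stationary_residual(0.3, 0.2, two_waves, energy=0.125))   # not small: E fits only one component
```

The first residual is limited only by the finite-difference step, whereas the second stays of order unity, because no single value of $E$ fits both components.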

It is to be noted that since there is an imaginary coefficient in
(136), the complex conjugate of $\psi$ does not satisfy the same
equation but does satisfy the following one:

$$\Delta\psi^* - \frac{8\pi^2 m}{h^2}U\psi^* = +\frac{4\pi i m}{h}\frac{\partial\psi^*}{\partial t}. \tag{136'}$$


31. Current density. In analogy to the intensity of illumination,
defined statistically in §19, it is convenient to define the
(probabilistic) particle flux density. We shall show that there
exists a vector $\mathbf{i}$ such that $i_n\,d\sigma\,dt$ (where $d\sigma$ is an element of surface
and $n$ is the normal to it, with a given positive direction) represents
the probability, taken algebraically,* that the particle crosses the
element $d\sigma$ in the time $dt$.

* That is, the difference between the probability of crossing in the positive
sense and that in the negative sense.

For this demonstration we shall suppose (as in the footnote on
page 145) that there is not one, but a great number $N$ of systems in
the same conditions, and not interacting. The mean density of
particles will be $N\psi\psi^*$. Therefore in any enclosed space $S$ there will
be

$$N\int_S \psi\psi^*\,dS$$

particles on the average.

The increase of this number in unit time is

$$N\frac{\partial}{\partial t}\int_S \psi\psi^*\,dS = N\int_S\left(\psi^*\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^*}{\partial t}\right)dS,$$

in which the time derivatives may be eliminated by means of (136) and (136').

On the other hand, this quantity must be equal to the average
number of particles which enter the volume $S$ in unit time across
the surface $\sigma$. Therefore we must have

$$\frac{Nh}{4\pi i m}\int_S\left(\psi\,\Delta\psi^* - \psi^*\,\Delta\psi\right)dS = N\int_\sigma i_n\,d\sigma$$

(taking the normal to be directed inward). Transforming the
volume integral into a surface integral, we have

$$\frac{Nh}{4\pi i m}\int_\sigma\left(\psi^*\frac{\partial\psi}{\partial n} - \psi\frac{\partial\psi^*}{\partial n}\right)d\sigma = N\int_\sigma i_n\,d\sigma.$$


In order for this equation to hold for any $\sigma$, it suffices to take

$$\mathbf{i} = \frac{h}{4\pi i m}\left(\psi^*\operatorname{grad}\psi - \psi\operatorname{grad}\psi^*\right). \tag{137}$$

This is the desired expression for the probability current density.

It is to be noted that if the particle is in a "stationary state"
(see §29), that is, if $\psi$ has the form (128), the current density
becomes

$$\mathbf{i} = \frac{h}{4\pi i m}\left(u^*\operatorname{grad}u - u\operatorname{grad}u^*\right) \tag{137'}$$

and is independent of time, just as is the average density $\psi\psi^* = uu^*$.
For this reason these states are called stationary states.
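As an illustration of the current density formula, the following minimal sketch evaluates its one-dimensional form by finite differences. It is not part of the original text: the units $h = m = 1$, the two sample waves, and the function names are assumptions made for the example. For a plane wave $a\,e^{2\pi i k x}$ the formula gives $(hk/m)|a|^2$, while a real amplitude carries no current.

```python
import cmath
import math

H = 1.0  # Planck's constant h, arbitrary units (assumption of this sketch)
M = 1.0  # particle mass m, same units (assumption of this sketch)


def current_density(wave, x, d=1e-5):
    """One-dimensional current: i = h/(4*pi*i*m) * (psi* psi' - psi psi*')."""
    value = wave(x)
    slope = (wave(x + d) - wave(x - d)) / (2.0 * d)
    # psi* psi' - psi psi*' = 2i * Im(psi* psi'), hence i = h/(2*pi*m) * Im(psi* psi').
    return H / (2.0 * math.pi * M) * (value.conjugate() * slope).imag


def traveling(x, a=0.5, k=2.0):
    """Plane wave a * exp(2*pi*i*k*x); the expected current is (h*k/m) * |a|^2."""
    return a * cmath.exp(2j * math.pi * k * x)


def standing(x, k=2.0):
    """Real amplitude cos(2*pi*k*x); the expected current is zero."""
    return complex(math.cos(2.0 * math.pi * k * x))


print(current_density(traveling, 0.3))
print(current_density(standing, 0.3))
```

The current of the traveling wave is the same at every point, in agreement with the uniform density $|a|^2$ moving with velocity $hk/m$.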

If we are dealing with particles carrying an electric charge $e$, the
vector $\mathbf{j} = e\,\mathbf{i}$ obviously represents the average value of the electric
current density.

In the first works on wave mechanics $\psi\psi^*$ was not interpreted as a
probability density or an average density but as a true density, so that
the electron was thought of as a continuous distribution of electricity, of
electrical charge density $\rho = e\psi\psi^*$. Consequently, the vector $\mathbf{j} = e\,\mathbf{i}$
defined by (137) was interpreted as a true electric current density (not
average), and on this basis the electromagnetic field produced and the
emitted radiation were calculated by using the ordinary laws of
electromagnetic theory. This hypothesis, although leading to correct results in
some cases, was in manifest contrast with all phenomena in which the
electron exhibits its corpuscular nature, and was therefore soon abandoned.
Born was the first to suggest the probabilistic interpretation, which was
later perfectly incorporated into the framework of the uncertainty principle,
as has been shown in the preceding sections.

32. Electromagnetic field and radiation. When the particle 
under consideration is electrically charged (for example, if it is an 
electron), the problem arises of determining its electric and magnetic 
effects upon other particles and, in particular, its radiation. This 
latter question has been answered in a precise manner by a theory 
developed by Dirac, for which the reader is referred to other works. 

We shall limit ourselves here to mentioning that we may get an
estimate of the average value of the electric field or of the magnetic
field (see footnote on page 145) by applying the ordinary laws of
electromagnetism and taking for charge and current densities the
average values

$$\rho = e\psi\psi^*, \tag{138}$$

$$\mathbf{j} = e\,\mathbf{i} = \frac{he}{4\pi i m}\left(\psi^*\operatorname{grad}\psi - \psi\operatorname{grad}\psi^*\right). \tag{139}$$

In the case of a stationary state these expressions become

$$\rho = euu^*, \tag{140}$$

$$\mathbf{j} = \frac{he}{4\pi i m}\left(u^*\operatorname{grad}u - u\operatorname{grad}u^*\right), \tag{141}$$

and are independent of $t$. In this case, then, the average distribution
of charges and currents is stationary. Of course, the average
values of the electric and magnetic fields will then also be
independent of time. In particular, if $u$ is real, we have $\mathbf{j} = 0$; that is,
the average distribution of charges is not only stationary but also
static. Such a solution always exists, the coefficients of (131') being
real; as a matter of fact, in cases where there is no degeneracy, it is the
only solution (except for an unimportant factor $e^{i\vartheta}$ with $\vartheta$ constant).

As for the radiation emitted, we shall confine ourselves here to
stating the essential results of the Dirac theory.* If, at the time
$t = 0$, the system finds itself in a stationary state of energy $E_n$
(which is not to be the lowest one) and we observe it at the time $t$,
there is a certain probability, increasing with $t$, of finding it in a
state of energy $E_m < E_n$, the difference in energy being emitted in
the form of radiation of frequency

$$\nu_{nm} = \frac{E_n - E_m}{h}. \tag{142}$$

* See also E. Fermi, "Quantum Theory of Radiation," Rev. Mod. Phys. 4,
pp. 87-132 (1932), or Nos. 15 and 29a of the Bibliography.


(This formula, which in the Bohr theory constitutes a postulate in
itself, is derived instead, in the Dirac theory, from the general
principles of quantum mechanics.) This probability, in every
infinitesimal interval $dt$, increases by $P_{nm}\,dt$, where

$$P_{nm} = \frac{64\pi^4\nu_{nm}^3}{3hc^3}\left(|X_{nm}|^2 + |Y_{nm}|^2 + |Z_{nm}|^2\right). \tag{143}$$

In this formula, $X_{nm}$, $Y_{nm}$, and $Z_{nm}$ represent three integrals into
which the eigenfunctions of the two stationary states enter, that is,
the initial and the final eigenfunctions:

$$X_{nm} = e\int x\,u_n^*\,u_m\,dS \tag{144}$$

(and analogous expressions for the other two). If initially we have
a great number of atoms in the $n$th state, the intensity of the
radiation of frequency $\nu_{nm}$ emitted at every instant will naturally be
proportional to their number and to $P_{nm}$. Hence the latter is
essentially determined by the three quantities $X_{nm}$, $Y_{nm}$, and $Z_{nm}$.
If, in particular, all three come out to be zero, the transition from
the $n$th to the $m$th state has zero probability, or is "forbidden";
the corresponding spectral line will be missing. In this manner
justification is given to the so-called "selection rules," which were
long known by spectroscopists and which were explained in the
Bohr-Sommerfeld theory by means of the correspondence principle
(see §64).
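A selection rule of this kind can be seen in a simple numerical sketch, which is not part of the original text. As an assumed example it takes the real eigenfunctions $u_n = \frac{1}{\sqrt{l}}\sin\frac{n\pi}{2l}(x + l)$ of a particle on the segment $-l \le x \le l$ (the problem of §38), sets $e = l = 1$, and evaluates the integral $X_{nm}$ by Simpson's rule; the function names are hypothetical. The integral vanishes whenever $n - m$ is even, so those transitions are "forbidden."

```python
import math


def box_eigenfunction(n, x, half_length=1.0):
    """u_n = sin(n*pi*(x + l)/(2*l)) / sqrt(l) on the segment -l <= x <= l."""
    return (math.sin(n * math.pi * (x + half_length) / (2.0 * half_length))
            / math.sqrt(half_length))


def dipole_x(n, m, half_length=1.0, charge=1.0, intervals=2000):
    """X_nm = e * integral of x * u_n * u_m over the segment (composite Simpson rule)."""
    step = 2.0 * half_length / intervals
    total = 0.0
    for j in range(intervals + 1):
        x = -half_length + j * step
        weight = 1 if j in (0, intervals) else (4 if j % 2 else 2)
        total += (weight * x * box_eigenfunction(n, x, half_length)
                  * box_eigenfunction(m, x, half_length))
    return charge * total * step / 3.0


for n, m in [(2, 1), (3, 1), (3, 2), (4, 2)]:
    status = "forbidden" if abs(dipole_x(n, m)) < 1e-9 else "allowed"
    print(n, m, status)
```

The vanishing integrals follow from symmetry: $x$ is odd about the midpoint of the segment, so the product $u_n u_m$ must also be odd for the integral to differ from zero.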

Furthermore,* we find that each of the three quantities $X_{nm}$,
$Y_{nm}$, and $Z_{nm}$ determines a component of the amplitude of the
electric field in the emitted light, so that we may also get the state of
polarization of the latter from them. These quantities correspond
to what in classical theory are the amplitude components of the
electric dipole moment of the emitting system. If, for instance,
$X_{nm} = Y_{nm} = 0$, $Z_{nm} \neq 0$, the emitted radiation will correspond to
that of a linear oscillator parallel to the $z$-axis, and hence its plane
of vibration (plane containing the ray and the electric vector) is
parallel to the $z$-axis.

* It will be seen in §33 of Part III that $X_{nm}$, $Y_{nm}$, and $Z_{nm}$ are the matrix
elements which represent the components of the electric dipole moment in
quantum mechanics.

Analogous arguments may be made in connection with absorption
(see also §43 of Part III).



CHAPTER 7 

One-dimensional Problems 

33. Characteristics of one-dimensional problems. In order to
study the Schrödinger equation in the simplest cases, we shall treat
in this chapter some "one-dimensional" problems, by which we
understand that in $\psi$, in $U$, and in all other quantities which
eventually enter, only one of the space coordinates (in addition to $t$)
occurs, for instance only $x$.

The physical characteristic of these problems is that the particle 
may be found with equal probability at all points of any plane 
normal to the x-axis, or that its coordinates y and z are completely 
indeterminate, able to take on any value with equal probability. 
We are looking only for the probability distribution of x. We can 
also say that we are studying the motion, not of the particle P, but 
only of its projection P' on the x-axis. Therefore these problems 
correspond to those of classical mechanics in which the motion of a 
point along a straight line is studied. We shall make use of the 
same terminology and shall speak of the motion of the particle along 
the x-axis (by which we refer to its projection P'). Incidentally, 
we note that the particle will necessarily be subject to the conditions 
which we are now considering, whenever we suppose that the particle 
is constrained to move parallel to the x-axis (not on the x-axis). 
In fact, by virtue of the uncertainty principle, this condition
($p_y = p_z = 0$) implies the complete indeterminacy of the coordinates
$y$ and $z$.

With this premise, (131') becomes, in our case, an equation with
ordinary derivatives:

$$\frac{d^2u}{dx^2} + \frac{8\pi^2 m}{h^2}\left[E - U(x)\right]u = 0, \tag{145}$$

which we shall call the one-dimensional Schrödinger equation (for
stationary states).

We shall now show the application of this equation to some
particular problems, with the idea in mind of illustrating more
effectively, by means of analytically simple examples, the spirit of
wave mechanics; some of these problems are also susceptible of
direct physical application.

It will be useful to keep in mind, as an intuitive guide to the
solution of these problems, their formal analogy with optical
problems in which the light propagates by plane waves along the
$x$-axis and in which the index of refraction (independent of $y$ and $z$)
varies along $x$ according to $\sqrt{E - U}$. In these one-dimensional
problems we may also invoke the analogy with the transverse
oscillations of a string, making $u$ correspond to the displacement of
the string, and its density to $(E - U)$.

34. Qualitative discussion of one-dimensional problems. Equation
(145) is of the type

$$\frac{d^2u}{dx^2} + f(x)\,u = 0, \tag{146}$$

with $f(x) = \frac{8\pi^2 m}{h^2}(E - U)$. If we represent the potential $U$ as a
function of $x$ by a graph (Fig. 24) and if we draw a horizontal line of
ordinate $E$, the intersections yield the values $x_1, x_2, \ldots$ of $x$ for
which $f(x) = 0$. These points divide the $x$-axis into regions, in
some of which $f > 0$, in others $f < 0$. In classical mechanics the
first regions (those for which the difference $E - U$, which represents
the kinetic energy, is positive) are the only ones in which the particle
may move, the total value of the energy $E$ being assigned. On the
contrary, the other regions (in which $E - U < 0$) are inaccessible
to a particle of energy $E$. As we shall see, this difference does not
hold true in wave mechanics, although even here the distinction of
the two regions has an important significance, because the behavior
of $u(x)$ is considerably different in them.

We observe first of all that in the regions where $f > 0$, $u$ and
$\frac{d^2u}{dx^2}$ have opposite signs, and hence the curve representing $u$ is
always concave toward the $x$-axis there; conversely, it is convex
toward the $x$-axis in regions in which $f < 0$. It is clear, then, that
in the first regions the curve may cross the $x$-axis several times with
an oscillatory behavior; whereas in the second regions the curve, if
it crosses the $x$-axis even once, cannot turn back and cross the axis
again (in the same region) since it has to stay convex, and hence one
of these regions may at the most contain a single node. Thus the
curve is no longer oscillatory in character. This concept becomes
clearer if we think of the particular case $f$ = constant. If this
constant is positive, the curve is sinusoidal (oscillatory in character,
concave toward the $x$-axis); if it is negative, the curve is of the
exponential type (nonoscillatory in character, convex toward the $x$-axis).

This consideration turns out to be useful in a discussion of the
qualitative behavior of the eigenfunctions corresponding to a given
potential curve and to a given energy level. For instance, let us
consider a potential of the type shown in Fig. 24, that is, a potential
possessing a single minimum and tending monotonically toward
infinity to the right and left of it in any manner. The harmonic
oscillator, for example, belongs to this class (see §39). For any
value of $E$, let us consider a curve $u$ (dashed curve) representing a
solution which to the left approaches the $x$-axis asymptotically.
Toward the right it moves away from the $x$-axis, being convex
toward it, up to point $A$, where it will start a number of oscillations
over the portion $AB$, only to assume, to the right of $B$, a nonoscillatory
behavior which in general will make it move away indefinitely
from the $x$-axis. Only if $E$ has one of the values $E_n$ (eigenvalues),
will the curve approach the $x$-axis to the right too; it will then
represent an eigenfunction $u_n$ (solid curve).
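The behavior just described can be reproduced by integrating the equation $u'' + f(x)\,u = 0$ from left to right for trial values of $E$. The following is a minimal sketch, not part of the original text: it assumes the dimensionless oscillator form $f(x) = 2E - x^2$ (for which the eigenvalues are known to be $E = n + \frac{1}{2}$), starts from $u = 0$ with a small slope far to the left, and uses a fourth-order Runge-Kutta step; all names are hypothetical. The sign of $u$ at the right end changes when $E$ crosses an eigenvalue, which allows the eigenvalue to be located by bisection.

```python
def right_end_value(energy, half_width=5.0, steps=2000):
    """Integrate u'' = -(2E - x^2) u from -half_width to +half_width; return u at the right end."""
    step = 2.0 * half_width / steps
    x, u, slope = -half_width, 0.0, 1e-6  # mimics the solution that dies out to the left

    def curvature(position, value):
        return -(2.0 * energy - position * position) * value

    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for the pair (u, u').
        k1u, k1s = slope, curvature(x, u)
        k2u, k2s = slope + 0.5 * step * k1s, curvature(x + 0.5 * step, u + 0.5 * step * k1u)
        k3u, k3s = slope + 0.5 * step * k2s, curvature(x + 0.5 * step, u + 0.5 * step * k2u)
        k4u, k4s = slope + step * k3s, curvature(x + step, u + step * k3u)
        u += step * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        slope += step * (k1s + 2.0 * k2s + 2.0 * k3s + k4s) / 6.0
        x += step
    return u


def find_eigenvalue(low, high, iterations=30):
    """Bisect on E until the curve neither escapes upward nor downward on the right."""
    value_low = right_end_value(low)
    for _ in range(iterations):
        middle = 0.5 * (low + high)
        value_middle = right_end_value(middle)
        if value_low * value_middle > 0.0:
            low, value_low = middle, value_middle
        else:
            high = middle
    return 0.5 * (low + high)


print(find_eigenvalue(0.3, 0.7))
print(find_eigenvalue(1.3, 1.7))
```

For any $E$ other than an eigenvalue the computed curve moves away from the axis on the right, upward on one side of the eigenvalue and downward on the other.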

The portion over which the curve shows an oscillatory character
is evidently the region in which the particle would oscillate
according to classical mechanics.

35. Particle of definite energy, not subject to forces. Let us
apply the one-dimensional Schrödinger equation to the case of a
particle without forces, free to move from $x = -\infty$ to $x = +\infty$.
We may set $U = 0$ in (145), and hence, substituting for convenience

$$k^2 = \frac{2mE}{h^2}, \tag{147}$$

we may write the equation

$$\frac{d^2u}{dx^2} + 4\pi^2k^2u = 0, \tag{148}$$

$k$ being constant. This is the equation studied in §8 and has for
general solution

$$u = a\,e^{2\pi i k x} + b\,e^{-2\pi i k x}, \tag{149}$$

whence

$$\psi = \left(a\,e^{2\pi i k x} + b\,e^{-2\pi i k x}\right)e^{-\frac{2\pi i}{h}Et}. \tag{150}$$

We immediately recognize that $E$ may not be negative (and
hence that $k$ must be real) because otherwise $u$ would become
infinite for $x \to +\infty$ or $x \to -\infty$, which case must be excluded (see
§28). Aside from this consideration, $k^2$ and hence $E$ are completely
arbitrary, as we have seen, between 0 and $+\infty$ (continuous
eigenvalue spectrum); that is, the particle may have any (positive)
energy. Let us suppose $E$ to be fixed once and for all. To this
fixed value there correspond two linearly independent solutions
represented by the two terms in (149), and thus we are dealing with
one of the cases of degeneracy studied in §6.

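Let us first consider only the first term; that is, let us set $b = 0$
and take

$$\psi = a\,e^{2\pi i\left(kx - \frac{E}{h}t\right)}. \tag{151}$$

This formula represents a traveling plane wave train (see §12)
of wavelength

$$\lambda = \frac{1}{k} \tag{152}$$

and of (phase) velocity

$$V = \frac{\nu}{k} = \frac{E}{hk}. \tag{153}$$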


It is seen that according to classical mechanics, there correspond
to the energy $E$ a velocity and a momentum of the particle, given
respectively by

$$v = \pm\sqrt{\frac{2E}{m}}, \qquad p = \pm\sqrt{2mE} = \pm hk.$$

Hence for a particle in progressive motion we have

$$p = hk,$$

and then (152) may be identified with the already known formula
$\lambda = h/p$, while (151) takes on the form

$$\psi = a\,e^{\frac{2\pi i}{h}\left(px - \frac{p^2}{2m}t\right)}. \tag{151'}$$

The constant $a$ is determined by the normalization procedure
for continuous spectra explained in §10. We note that the
probability density for the position of the particle, given by $\psi\psi^*$, turns
out to be independent of $x$. This result is in agreement with the
fact that, since the momentum is determined with infinite accuracy,
the position of the particle remains completely uncertain, in accord
with the uncertainty principle.

Analogous considerations may be made concerning the second
term of (150), which we may also put in the form (151') setting
$p = -hk$. This term then represents a wave traveling in the
opposite direction and corresponds to a particle whose momentum
is directed in the negative sense. We see that the degeneracy of
the problem in wave mechanics has its origin in the fact that in
ordinary mechanics there correspond two values of $p$ to a given
value of $E$, one positive and one negative.

Considering now the general solution (150) and keeping the
principle of superposition in mind, we can say that this solution
represents the case in which the energy of the particle is determined
while the direction of its momentum is not determined, so that
there is a certain probability, proportional to $|a|^2$, of finding the
particle in forward motion, and a probability, proportional to $|b|^2$,
of finding it in backward motion.
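To attach numbers to the relations of this section, the following minimal sketch evaluates the wave number $k = \sqrt{2mE}/h$, the wavelength $\lambda = 1/k = h/p$, and the phase velocity $V = E/(hk)$ for an electron. It is not part of the original text: the SI values of the constants and the sample energy of 150 electron volts are assumptions of the example, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34                # Planck's constant, J s
ELECTRON_MASS = 9.1093837015e-31  # kg
EV = 1.602176634e-19              # joules per electron volt


def wave_number(energy, mass):
    """k = sqrt(2 m E) / h, from k^2 = 2 m E / h^2."""
    return math.sqrt(2.0 * mass * energy) / H


def wavelength(energy, mass):
    """lambda = 1/k, which is the same as h/p with p = h k."""
    return 1.0 / wave_number(energy, mass)


def phase_velocity(energy, mass):
    """V = E / (h k)."""
    return energy / (H * wave_number(energy, mass))


def particle_velocity(energy, mass):
    """Classical velocity v = sqrt(2 E / m)."""
    return math.sqrt(2.0 * energy / mass)


energy = 150.0 * EV
print(wavelength(energy, ELECTRON_MASS))  # about one angstrom
print(phase_velocity(energy, ELECTRON_MASS) / particle_velocity(energy, ELECTRON_MASS))
```

With $E = p^2/2m$ the phase velocity comes out as $p/2m$, that is, half the classical particle velocity; the phase velocity is therefore not the velocity of the particle.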

36. Particle of undetermined energy (wave group). In practice,
the most interesting case is the more general one in which the
probability density $P_0(x)$ for the abscissa of the particle is prescribed at
time $t = 0$, as well as the probability density $Q_0(p)$ of the initial
momentum. These two functions will be represented by curves
whose shapes depend on the type of experiment used to define the
initial state of the system.

$\psi$ will generally be of the form (133'), as has been pointed out in
§29 (sum or integral of an infinite number of stationary solutions
corresponding to the different values of $E$). Since, however, as has
been shown above, to each value of $E$ there correspond two solutions
differing in the sign of $p$, it is convenient to characterize the
individual components by the value of $p$ rather than by the value of $E$,
and thus to write

$$\psi = a\int \varphi_0(p)\,e^{\frac{2\pi i}{h}\left(px - \frac{p^2}{2m}t\right)}dp, \tag{154}$$

where $a$ is a normalization constant which will be fixed later and
$\varphi_0(p)$ has a meaning analogous to $c(E)$ in (133'), that is,

$$|\varphi_0(p)|^2 = Q_0(p). \tag{155}$$

It is to be noted that this relation determines only the absolute
value of $\varphi_0(p)$, leaving the argument $\vartheta$ arbitrary. Hence we shall
write

$$\varphi_0(p) = \sqrt{Q_0(p)}\,e^{i\vartheta(p)} \tag{155'}$$

and the function $\vartheta(p)$ must be determined in such a way as to satisfy
the other initial condition, that is, so that

$$|\psi_0|^2 = P_0(x), \tag{156}$$

where $\psi_0$ is $\psi$ for $t = 0$:

$$\psi_0(x) = a\int \varphi_0(p)\,e^{\frac{2\pi i}{h}px}\,dp. \tag{157}$$

We shall see shortly under what conditions this is possible.

It will be observed that (157) may be identified with (58), §12,
by identifying $\psi_0$ with $f$ and setting $p = hk$, and

$$A(k) = a\,\varphi_0(hk)\,h; \tag{158}$$

consequently, (59) yields

$$\varphi_0(p) = \frac{1}{ah}\int \psi_0(x)\,e^{-\frac{2\pi i}{h}px}\,dx.$$

Noting that by virtue of (51') of §12, we have

$$\int \psi_0\psi_0^*\,dx = |a|^2h\int \varphi_0\varphi_0^*\,dp \tag{157'}$$

and that, because of the normalization condition, both integrals
of this formula must be equal to 1, we find that we may put¹

$$a = \frac{1}{\sqrt{h}}$$

into the preceding formulas.

In the most interesting cases, the initial position of the particle
is defined with a certain approximation, so that the curve of $P_0(x)$
is bell-shaped, as is, for example, the Gaussian error curve. We may
then say that (154) represents a "wave group" which initially has
the "profile" (see §14) defined by the curve $P_0(x)$, and then
displaces itself (while deforming) with a group velocity given by (74),
which in the present case immediately gives $v = p/m$. The group
velocity may thus be identified with the particle velocity in classical
mechanics, as is to be expected, by virtue of the manner in which
the Schrödinger equation was constructed.

The uncertainty $\Delta x$ in the abscissa of the particle at a given
instant is defined by the formula [analogous to (65)]

$$(\Delta x)^2 = \int (x - \bar{x})^2\,\psi\psi^*\,dx \tag{160}$$

and similarly for the uncertainty $\Delta p$ in momentum

$$(\Delta p)^2 = \int (p - \bar{p})^2\,\varphi\varphi^*\,dp, \tag{161}$$

a formula which, when the relation between $p$ and $k$ and (158) is
recalled, coincides with (63). Thus we see that the two
uncertainties $\Delta x$ and $\Delta p$ are subject to the limitation imposed by (66),
which is now written

$$\Delta p\,\Delta x \geq \frac{h}{4\pi}; \tag{162}$$

hence the two curves $P_0(x)$ and $Q_0(p)$ may not be assigned
arbitrarily. If they are assigned in a manner such that this condition
is not satisfied initially, there exists no $\vartheta(p)$ which when put into
(155') satisfies (156); therefore it is not possible to construct a wave
group having the required properties. Thus the scheme of wave
mechanics is verified, with respect to the uncertainty principle, for
material particles. This result was to be expected, since the
formalism of wave mechanics is analogous to that of wave optics.

¹ The arbitrary constant of modulus 1 by which $a$ could be multiplied, and
hence also $\psi$, has no influence upon the probability, and hence is of no importance.





Of particular interest is the case in which the curves $P_0(x)$ and
$Q_0(p)$ are such that at time zero we have

$$\Delta x\,\Delta p = \frac{h}{4\pi} \tag{163}$$

(that is, such that the product of the uncertainties is the least
possible). As we have seen in §13, this case requires that the
functions $\psi_0(x)$ and $\varphi_0(p)$ be of the form (70) and (72). In the
present notation, if for brevity we set

$$\alpha = \frac{1}{\sqrt{2}\,(\Delta x)_0}, \qquad \beta = \frac{1}{\sqrt{2}\,(\Delta p)_0} \tag{164}$$

(that is, if we indicate by $\alpha$ and $\beta$ the precisions* of the initial
determinations of $x$ and $p$), the functions $\psi_0(x)$ and $\varphi_0(p)$ become

$$\psi_0(x) = C\,e^{-\frac{\alpha^2}{2}(x - x_0)^2 + \frac{2\pi i}{h}p_0x}, \tag{165}$$

$$\varphi_0(p) = D\,e^{-\frac{\beta^2}{2}(p - p_0)^2 - \frac{2\pi i}{h}x_0p}. \tag{166}$$

* Note that the terminology of error theory is being applied to a type of
uncertainty of an entirely different origin from that of the errors of observation.

The curves of the initial probabilities are thus of the Gaussian
type, as follows:

$$P_0(x) = CC^*e^{-\alpha^2(x - x_0)^2}, \tag{167}$$

$$Q_0(p) = DD^*e^{-\beta^2(p - p_0)^2}. \tag{168}$$

Equation (163) is equivalent to the following relation between
$\alpha$ and $\beta$:

$$\alpha\beta = \frac{2\pi}{h}. \tag{163'}$$

The moduli of the constants $C$ and $D$ are determined by the
normalization conditions, which yield

$$CC^* = \frac{\alpha}{\sqrt{\pi}}, \tag{169}$$

$$DD^* = \frac{\beta}{\sqrt{\pi}}. \tag{170}$$

Let us now calculate the probability curve $P(x)$ of position
at the time $t$.

We must first of all calculate $\psi$ at time $t$ by means of (154). If
we introduce expression (166) and set for short


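$$x - x_0 = \xi, \qquad p - p_0 = \eta,$$

(154) becomes

$$\psi = \frac{D}{\sqrt{h}}\,e^{\frac{2\pi i}{h}\left(p_0\xi - \frac{p_0^2}{2m}t\right)}\int e^{-\left(\frac{\beta^2}{2} + \frac{\pi i t}{hm}\right)\eta^2 + \frac{2\pi i}{h}\left(\xi - \frac{p_0}{m}t\right)\eta}\,d\eta.$$

The integral may be easily reduced to known definite integrals.*
It proves to be

$$\psi = \frac{D}{\sqrt{h}}\sqrt{\frac{\pi}{\frac{\beta^2}{2} + \frac{\pi i t}{hm}}}\;e^{\frac{2\pi i}{h}\left(p_0\xi - \frac{p_0^2}{2m}t\right)}\,e^{-\frac{\pi^2}{h^2}\,\frac{\left(\xi - \frac{p_0}{m}t\right)^2}{\frac{\beta^2}{2} + \frac{\pi i t}{hm}}}. \tag{171}$$

* Specifically, to the integral

$$\int_{-\infty}^{+\infty} e^{-a\eta^2 + ib\eta}\,d\eta = \sqrt{\frac{\pi}{a}}\,e^{-\frac{b^2}{4a}}.$$

After $\psi$ has been calculated thus, $\psi\psi^*$ will give the function $P(x)$
at time $t$. In order to write the square of the modulus of expression
(171) in a simple form, it is convenient to introduce the notation

$$\alpha_t = \frac{2\pi}{h\beta}\,\frac{1}{\sqrt{1 + \left(\frac{2\pi t}{hm\beta^2}\right)^2}}. \tag{172}$$

Thus we obtain

$$P(x) = DD^*\,\frac{\alpha_t}{\beta}\,e^{-\alpha_t^2\left(\xi - \frac{p_0}{m}t\right)^2};$$

or, if we note that because of (169) and (170)

$$\frac{DD^*}{\beta} = \frac{CC^*}{\alpha},$$

and introduce $x$ again:

$$P(x) = CC^*\,\frac{\alpha_t}{\alpha}\,e^{-\alpha_t^2\left(x - x_0 - \frac{p_0}{m}t\right)^2}. \tag{173}$$

This formula, when compared with (167), shows that the
probability curve at time $t$ is still a Gaussian, but its maximum, instead
of being at $x = x_0$, is at

$$x = x_0 + \frac{p_0}{m}t;$$

that is, the maximum moves with velocity $p_0/m$, as we already know.
Furthermore, the precision is no longer $\alpha$ but $\alpha_t$, given by (172),
which on account of (163') may also be written

$$\alpha_t = \frac{\alpha}{\sqrt{1 + \left(\frac{h\alpha^2}{2\pi m}t\right)^2}}. \tag{172'}$$

This formula shows that the precision diminishes with time,
that is, that the wave group spreads out more and more as it
propagates (after time $t = 0$), in correspondence with the fact (also valid
in ordinary mechanics) that when the initial velocity is not exactly
determined, the uncertainty in position grows with time. We note
that if we want to make use of a measurement made at time zero in
order to calculate the position at a previous instant ($t < 0$), again
we find an indeterminacy which is the larger, the further that
instant is removed from the instant of measurement.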
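The rate of spreading can be estimated with a short calculation. The following minimal sketch is not part of the original text. It assumes that the precision at time $t$ is $\alpha_t = \alpha/\sqrt{1 + (h\alpha^2t/2\pi m)^2}$ and that the uncertainty in position is $1/(\sqrt{2}\,\alpha_t)$; the SI constants, the choice of an electron initially localized within $10^{-10}$ m, and the function names are assumptions of the example.

```python
import math

H = 6.62607015e-34                # Planck's constant, J s
ELECTRON_MASS = 9.1093837015e-31  # kg


def precision(t, alpha0, mass):
    """alpha_t = alpha / sqrt(1 + (h * alpha^2 * t / (2*pi*m))^2)."""
    return alpha0 / math.sqrt(1.0 + (H * alpha0 ** 2 * t / (2.0 * math.pi * mass)) ** 2)


def position_uncertainty(t, initial_uncertainty, mass):
    """Uncertainty 1/(sqrt(2) * alpha_t), starting from a given uncertainty at t = 0."""
    alpha0 = 1.0 / (math.sqrt(2.0) * initial_uncertainty)
    return 1.0 / (math.sqrt(2.0) * precision(t, alpha0, mass))


def spreading_time(initial_uncertainty, mass):
    """Time after which the uncertainty has grown by a factor sqrt(2)."""
    return 4.0 * math.pi * mass * initial_uncertainty ** 2 / H


dx0 = 1.0e-10  # metres (assumed initial uncertainty)
tau = spreading_time(dx0, ELECTRON_MASS)
print(tau)
for t in (-tau, 0.0, tau, 10.0 * tau):
    print(t, position_uncertainty(t, dx0, ELECTRON_MASS))
```

The uncertainty depends only on $t^2$, so it is the same for an instant before the measurement as for the corresponding instant after it. Since the spreading time is proportional to $m(\Delta x)_0^2$, the effect is very rapid for an electron localized on an atomic scale and negligible for a macroscopic body.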

37. Potential step. Let us suppose that a particle of definite
energy $E$ is subject to a potential $U(x)$ represented in Fig. 25:

$U = 0$ for $x < 0$ (region I),

$U = U_0$ for $x > 0$ (region II).

This purely idealized condition may be considered as the limiting
case of the following. Let us suppose that over a segment $PP'$ there
exists a force field which forces the particle toward the left (as could be
realized, in the case of an electron, by two charged grids of opposite sign
located at $P$ and $P'$). Then the potential would have the shape $APQC$.
Now letting the interval $PP'$ tend to zero while maintaining the difference
of potential between $P$ and $P'$ (and hence making the field become
infinite), we reach the limiting case in question. It may therefore be
considered a "double-layer potential."

Fig. 25

The Schrödinger equation in region I will still be given by (148),
whereas in region II it will have the same form except that instead
of $k$ there enters the constant

$$k_0 = \frac{\sqrt{2m(E - U_0)}}{h},$$

which may be either real or imaginary, according to whether $E > U_0$
or $E < U_0$.

The function $u$ will therefore be given by

$$u = A_1e^{2\pi i k x} + B_1e^{-2\pi i k x} \quad (\text{for } x < 0), \tag{174}$$

$$u = A_2e^{2\pi i k_0 x} + B_2e^{-2\pi i k_0 x} \quad (\text{for } x > 0); \tag{174'}$$

and since $u$ must be continuous, together with its first derivative,
for $x = 0$, the four constants $A_1$, $A_2$, $B_1$, $B_2$ must be connected by
the relations

$$A_1 + B_1 = A_2 + B_2, \qquad A_1 - B_1 = \mu(A_2 - B_2), \tag{175}$$

where we have put

$$\mu = \frac{k_0}{k}.$$


It is now convenient to distinguish two cases according to
whether the energy $E$ of the particle is (a) above or (b) below the
difference $U_0$ of the potential levels. We shall suppose in each
case that the particle is moving from $-\infty$ toward $O$.

Case (a): $E > U_0$. In this case, according to classical mechanics,
the particle would overcome the potential step and would continue
its motion to the right of $O$ with a velocity reduced in the ratio $1:\mu$.
From the point of view of wave mechanics, however, we note that
each of relations (174) and (174') represents the superposition of
forward and backward moving waves. Since we suppose that the
particle comes from $-\infty$ and not from $+\infty$, there must not be any
waves moving backward in region II, and hence $B_2 = 0$. Equations
(175) then yield, upon elimination of $A_2$,

$$B_1 = \frac{1 - \mu}{1 + \mu}A_1.$$

In general, then, there will be in region I, in addition to
forward-moving waves of amplitude $A_1$, waves moving in the opposite
direction, reflected from the potential step, with amplitude $B_1$. The
meaning of these constants can be stated more easily if we think not
of a single moving particle but of a large number of particles. Then
$A_1A_1^*$ is proportional to the number of particles (per unit length)
which are moving forward in the region to the left of $O$, whereas
$B_1B_1^*$ is proportional to the number of particles moving in the
opposite direction. Hence the ratio

$$R = \frac{B_1B_1^*}{A_1A_1^*} = \left(\frac{1 - \mu}{1 + \mu}\right)^2$$

represents the fraction of particles which are reflected, or the
probability that one particle will be reflected. Therefore $R$ may be
called the reflection coefficient of the step, and

$$T = 1 - R$$

represents the transmission coefficient.

Hence an incident particle has, according to wave mechanics,
a certain probability $R$ of being reflected (with the same velocity)
and a certain probability $T$ of passing over the step (with a velocity
reduced in the ratio $1:\mu$, since the wavelength $1/k$ becomes $1/k_0$).
The problem is analogous to the reflection of light from a
semi-reflecting surface; indeed, the reflection coefficient in optics
represents the probability of an incident photon being reflected.
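The coefficients of case (a) are easily evaluated. The following minimal sketch is not part of the original text: it assumes $\mu = k_0/k = \sqrt{1 - U_0/E}$, which follows from the definitions of $k$ and $k_0$, and the function name is hypothetical. Energies may be given in any unit, since only the ratio $U_0/E$ enters.

```python
import math


def step_coefficients(energy, step_height):
    """Return (R, T) for a particle of energy E passing over a step of height U0 < E."""
    if energy <= 0.0 or energy <= step_height:
        raise ValueError("this sketch covers only case (a): E > 0 and E > U0")
    mu = math.sqrt(1.0 - step_height / energy)  # mu = k0 / k
    reflection = ((1.0 - mu) / (1.0 + mu)) ** 2
    return reflection, 1.0 - reflection


for ratio in (0.0, 0.5, 0.75, 0.99):
    print(ratio, step_coefficients(1.0, ratio))
```

For $U_0/E = 3/4$, for instance, $\mu = 1/2$ and one particle in nine is reflected, although classically none would be. The reflection coefficient tends to zero as $U_0/E \to 0$ and to unity as $E$ approaches $U_0$ from above.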

Case (b): $E < U_0$. In this case, according to classical mechanics,
the particle would be thrown back without ever passing over the
step. In the wave treatment, instead, we must observe that $k_0$ and
$\mu$ are imaginary. Let us therefore put

$$k_0 = ik', \qquad \mu = i\mu',$$

with $k'$ and $\mu'$ real, and let us write (174') in the form

$$u = A_2e^{-2\pi k'x} + B_2e^{2\pi k'x} \quad (x > 0). \tag{176}$$

We then note that for $u$ to remain finite for $x \to \infty$ we must
have $B_2 = 0$. If this stipulation is kept in mind, (175) immediately
gives

$$B_1 = \frac{1 - i\mu'}{1 + i\mu'}A_1.$$

Since $1 + i\mu'$ and $1 - i\mu'$ have the same modulus, we can see that
the reflected waves have an amplitude equal to the incident waves;
that is, we have

$$R = \frac{B_1B_1^*}{A_1A_1^*} = 1,$$

which means that all the particles are reflected. However, $u$ is
different from zero even to the right of $O$, where it is given by (176),
which reduces to

$$u = A_2e^{-2\pi k'x}. \tag{177}$$

Hence the real part of $u$ (and similarly the imaginary part) has
the shape shown in Fig. 26. Consequently, if we perform an
observation of position of the particle, there is a certain probability
of finding it also to the right of $O$, a probability which is appreciable
up to a distance of the order of $1/k'$ from $O$. In this result wave
mechanics differs sharply from classical mechanics.

This result might appear paradoxical from the classical point of
view because in the region to the right of $O$, the potential energy $U_0$
of the particle would be greater than its total energy $E$. Hence the
kinetic energy $E - U_0$ would be negative and the velocity imaginary.
This is the reason why this region is inaccessible to the
particle according to classical mechanics. However, from the point
of view of wave mechanics, the result is not paradoxical at all. In
fact, we must not think that the result means that the particle
penetrates a little into the potential step before it is reflected (an
interpretation contrary to the spirit of wave mechanics, according
to which the motion of a particle may not be followed); instead, it
means that upon making an observation of position, we may find
the particle also to the right of $O$. Now it must be remembered
that a direct observation to localize the particle inevitably requires
that an (unknown) momentum be imparted to the particle, and hence
that its energy will be altered. Therefore, upon finding the particle
to the right of $O$, we cannot say that we have found a violation of
the principle of conservation of energy, since the fact may be
interpreted, in corpuscular terminology, by saying that the energy
necessary to overcome the step was imparted to the particle in the
very act of observation.*

* This consideration may be rendered more precise by means of a more
detailed analysis which we shall not make here. We restrict ourselves here
to pointing out briefly the following objection which apparently may be made
to the argument above: Is it not possible to perform the observation in such a
way as to have an uncertainty $\Delta p$ in momentum less than the momentum
which classically is necessary to overcome the step (that is, less than
$\sqrt{2m}\,(\sqrt{U_0} - \sqrt{E})$), while being resigned to having a
$\Delta x > \frac{h}{4\pi\,\Delta p}$ by the uncertainty principle? In that case, if the
measurement gave $x > \Delta x$, we should be certain, in spite of the
uncertainty in $x$, that the particle is to the right of $O$.
This reasoning is erroneous because $\Delta x$ and $\Delta p$ represent the average values
of the spreads and not their maximum values which, by a theorem stated in §13,
cannot both be finite. Furthermore, since the position of the particle is not
a priori altogether uncertain (it being practically certain that $x < 1/k'$), it is
not possible to make $\Delta p$ as small as we wish (in fact, the momentum of the
particle even before the observation is $\pm\sqrt{2mE}$, and hence has an uncertainty
$2\sqrt{2mE}$).

The phenomenon here considered is analogous to a known
phenomenon of optics, where in the total reflection of light there also
exists a luminous disturbance in the second medium, in the immediate
neighborhood of the reflecting surface. However, in the case of
light waves total reflection occurs only if the angle of incidence is
greater than the critical angle, whereas for de Broglie waves the
phenomenon occurs even at normal incidence, as we have seen.

38. Particle on a limited line segment. We shall now show a
first example of quantization by the Schrödinger method, considering
a particle which may move (without forces) on a limited portion
of the line $AB$, recoiling elastically at the endpoints. We shall look
first of all for the stationary solutions (that is, with $E$ definitely
determined).

The Schrödinger equation will still be given by (148), but with
the condition that $\psi$ vanish outside of the segment $AB$, the
probability of finding the particle there being zero by hypothesis.

Fig. 27

The conditions of this idealized problem may be expressed in a
manner which is more apt to show its physical significance if we
think of the two obstacles $A$ and $B$ as being two potential steps of
height $U_0$ (Fig. 27) and then imagine that $U_0$ tends toward infinity.
In this manner $k'$ of (177) approaches $\infty$, and hence the exponential
of Fig. 26 becomes equal to zero immediately beyond the step. We
can thus fix our attention only upon the behavior of $\psi$ within the
segment $AB$, imposing the condition that $\psi$ must vanish at the
endpoints.

Since equation (148) is identical with the one already studied in 
§8, with the coefficient of $u$ made to correspond to the parameter $\lambda$ of (21), the 
present problem has already been treated in §8 under conditions 
(a). It follows then from (24') that $k$ may take on only the values 

$$k_n = \frac{n}{4l}, \qquad n = 1, 2, 3, \ldots$$

Evidently, these values are those for which the half wavelength $1/(2k_n)$ 
is a submultiple of the length $2l$. From this equation we obtain, by 
means of (147), the energy levels, which are 

$$E_n = \frac{h^2 n^2}{32\,m\,l^2}.$$

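To these there correspond the (normalized) eigenfunctions given 
by (25): 

$$u_n = \frac{1}{\sqrt{l}}\,\sin\frac{n\pi}{2l}(x + l), \tag{179}$$

and hence 

$$\psi_n = \frac{1}{\sqrt{l}}\;e^{-\frac{2\pi i}{h}E_n t}\,\sin\frac{n\pi}{2l}(x + l). \tag{179'}$$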


The probability density of position at any instant is 

$$P(x) = \psi\psi^* = uu^* = \frac{1}{l}\,\sin^2\frac{n\pi}{2l}(x + l), \tag{180}$$

and hence possesses nodes which divide AB into $n$ equal parts. 

The probability distribution of the momentum is obtained by 
noting that (179') may be written 

$$\psi_n = c_1\,e^{2\pi i\,(k_n x - E_n t/h)} + c_2\,e^{-2\pi i\,(k_n x + E_n t/h)},$$

where $c_1$ and $c_2$ are given by (23') of §8. Since the first of these two 
terms represents forward-moving waves of "wave number" $k_n$ and 
the second term represents identical waves moving in the opposite 
direction, at every instant there will be a probability proportional 
to $|c_1|^2$ of finding the particle with momentum $p_n = hk_n$ (see 
page 169) and a probability proportional to $|c_2|^2$ of finding it with 
momentum $-p_n = -hk_n$. From equations (23') we then find that 
$c_1$ and $c_2$ have the same modulus, and hence the two probabilities 
are equal. Therefore the probability of the momentum is not continuously 
distributed but is zero everywhere except for the two 
values $\pm hk_n$, where it is equal to $\tfrac{1}{2}$. It is also verified that the 
product of the uncertainty in $x$ (that is, $2l$) and the uncertainty in 
$p$ (that is, $2hk_n$) is $nh$, and hence of an order of magnitude not less 
than $h$. 
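
The results of this section lend themselves to a quick numerical check. The following is only a minimal sketch in Python, not part of the original treatment: the function names, the choice of an electron as the particle, and the value of the half-length $l$ are illustrative assumptions. It evaluates $k_n = n/(4l)$ and $E_n = h^2k_n^2/(2m)$, estimates the normalization integral of the eigenfunctions over AB, and prints the product of the two spreads in units of $h$, which by the argument above should come out equal to $n$.

```python
import math

H = 6.62607015e-34       # Planck constant h (J s)
M_E = 9.1093837015e-31   # electron mass (kg); illustrative choice of particle


def wave_number(n, l):
    """k_n = n / (4 l), so that the half wavelength 1/(2 k_n) equals 2l / n."""
    return n / (4.0 * l)


def energy(n, l, m=M_E):
    """E_n = h^2 k_n^2 / (2 m) = n^2 h^2 / (32 m l^2)."""
    return (H * wave_number(n, l)) ** 2 / (2.0 * m)


def u(n, x, l):
    """Normalized eigenfunction on -l <= x <= l; zero outside the segment."""
    if abs(x) > l:
        return 0.0
    return math.sin(n * math.pi * (x + l) / (2.0 * l)) / math.sqrt(l)


def norm(n, l, steps=20000):
    """Midpoint-rule estimate of the integral of u_n^2 over AB."""
    dx = 2.0 * l / steps
    return sum(u(n, -l + (i + 0.5) * dx, l) ** 2 for i in range(steps)) * dx


if __name__ == "__main__":
    l = 0.5e-9  # half-length of AB in metres; hypothetical value
    for n in (1, 2, 3):
        spread_product = (2.0 * l) * (2.0 * H * wave_number(n, l))
        print(n, energy(n, l), norm(n, l), spread_product / H)
```

The energies scale as $n^2$ and as $1/l^2$, so halving the segment quadruples every level.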

Let us now proceed to the case in which $E$ does not have a 
definite value (or where the system is not in a stationary or quantum 
state). In this case $\psi$ is represented by a sum of terms of the 
form (179'): 

$$\psi = \sum_{n=1}^{\infty} c_n \psi_n,$$

and $|c_n|^2$ represents the probability that the system is in the $n$th 
state, that is, that its energy is $E_n$. As we know, the coefficients $c_n$ 
are subject to the restriction $\sum |c_n|^2 = 1$, which is equivalent to the 
normalization of $\psi$. 

In particular, the $c_n$ may be determined so as to constitute a 
small wave group, and we find then that this group moves back and 
forth between A and B in a way similar to the motion predicted by 
classical mechanics but spreading out gradually, like the group 
considered in §36. 

39. The harmonic oscillator. A harmonic oscillator is a system 
made up of a material point particle moving along a straight line 
and attracted toward a point O of the line with a force proportional 
to the distance. If we take the line to be the x-axis and O the origin, 
the force acting upon the point will thus be $-Kx$, where $K$ is a 
positive constant. It is known that according to classical mechanics 
the point will perform simple harmonic oscillations about O with 
arbitrary amplitude and phase (depending upon the initial conditions), 
and frequency 

$$\nu_0 = \frac{1}{2\pi}\sqrt{\frac{K}{m}}. \tag{182}$$

In order to investigate the corresponding problem in wave 
mechanics, we note that the potential energy corresponding to the 
force $-Kx$ is 

$$U = \tfrac{1}{2}Kx^2 = 2\pi^2 m\nu_0^2 x^2 \tag{182'}$$

and hence, putting this into the one-dimensional Schrödinger 
equation (146), we obtain 

$$\frac{d^2u}{dx^2} + \frac{8\pi^2 m}{h^2}\left(E - 2\pi^2 m\nu_0^2 x^2\right)u = 0. \tag{183}$$

We are to find the eigenvalues and eigenfunctions of this equation 
for the interval $-\infty$ to $+\infty$. 

Let us first of all reduce the equation to a form which will put its 
analytical aspect clearly in evidence. In order to do this, it suffices 
to change the scale of $x$ by introducing the variable 

$$\xi = 2\pi\sqrt{\frac{m\nu_0}{h}}\;x$$

and then to divide the entire equation by $4\pi^2 m\nu_0/h$, after which 
it becomes 

$$\frac{d^2u}{d\xi^2} + (\epsilon - \xi^2)\,u = 0, \tag{183'}$$

where we have abbreviated by $\epsilon$ the constant 

$$\epsilon = \frac{2E}{h\nu_0}, \tag{184}$$

which represents the energy measured in units of $h\nu_0/2$. 

Now we have to find whether (183') possesses solutions which are 
finite and continuous everywhere and which tend to 0 for $\xi$ tending 
toward $\pm\infty$. It will be found that such solutions are possible only 
if $\epsilon$ is equal to an odd positive integer. 

Equation (183') may be discussed in the following manner. 
First of all we note that its coefficients are finite for all finite values 
of $\xi$; hence any possible singularities of $u$ may occur only for $\xi = \pm\infty$. 
With the criterion of §16 we recognize that in effect the 
equation has two singularities of the non-Fuchsian type at infinity. 
An idea of the asymptotic behavior of the solutions at these points 
may be had by the following heuristic method. If we attempt to 
satisfy the equation by letting $u = e^{\lambda\xi^2}$, we shall find that, neglecting 
the terms which are finite with respect to those in $\xi^2$, the equation 
may be satisfied asymptotically by taking $\lambda = \pm\tfrac{1}{2}$. Since we are 
looking for solutions which do not go to infinity, we must discard 
those having $e^{\xi^2/2}$ for asymptotic expression. Let us therefore put 

$$u = e^{-\xi^2/2}\,v \tag{185}$$

and let us find whether we can determine $v$ so as to satisfy the equation 
exactly and so that in addition $u$ tends toward zero at infinity, 
for which it suffices that $v$ have no essential singularities there. 
Substituting (185) into (183') we find for $v$ the equation 

$$v'' - 2\xi v' + (\epsilon - 1)\,v = 0. \tag{186}$$
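
The passage from (183') to (186) is a direct substitution. As a short worked check (a sketch added here for the reader's convenience, using only (185) and primes for derivatives with respect to $\xi$):

```latex
\begin{aligned}
u   &= e^{-\xi^2/2}\,v, \\
u'  &= e^{-\xi^2/2}\,\bigl(v' - \xi v\bigr), \\
u'' &= e^{-\xi^2/2}\,\bigl(v'' - 2\xi v' + (\xi^2 - 1)\,v\bigr), \\
u'' + (\epsilon - \xi^2)\,u
    &= e^{-\xi^2/2}\,\bigl(v'' - 2\xi v' + (\epsilon - 1)\,v\bigr) = 0 .
\end{aligned}
```

The terms in $\xi^2 v$ cancel, which is precisely the purpose of extracting the factor $e^{-\xi^2/2}$.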

Let us try to integrate this equation by a series of the form 

$$v = \sum_{r=0}^{\infty} a_r\,\xi^r, \tag{187}$$

from which we exclude the negative powers of $\xi$ since we want the 
solution to be finite also for $\xi = 0$. Substituting this series into 
(186), we find 

$$\sum_{r=0}^{\infty}\left[(r + 1)(r + 2)\,a_{r+2} + (\epsilon - 1 - 2r)\,a_r\right]\xi^r = 0.$$

In order for this equation to be satisfied identically, all coefficients 
must vanish; this condition gives for $a_r$ the recursion relation 

$$a_{r+2} = \frac{2r + 1 - \epsilon}{(r + 1)(r + 2)}\,a_r. \tag{188}$$

It is characteristic of this equation that it relates each coefficient 
to the one preceding it by two; thus, having fixed an arbitrary $a_0$, we 
obtain from it by means of (188) all the coefficients with even index; 
and by fixing an arbitrary value for $a_1$, all the coefficients with odd 
index. Two fundamental solutions are obtained, for instance, by 
taking 

$$a_0 \neq 0, \quad a_1 = 0 \quad \text{(series of even powers)} \tag{189}$$

or else 

$$a_0 = 0, \quad a_1 \neq 0 \quad \text{(series of odd powers)}, \tag{189'}$$

and any other solution will be a linear combination of these two. 

Now it may easily be shown that, in general, these two series 
have essential singularities for $\xi = \pm\infty$. Only in the case that 
one of the coefficients, for example $a_{n+2}$, vanishes is this not so, 
because then, according to (188), all the successive coefficients will 
also vanish and the series reduces to a polynomial of the $n$th degree. 
The condition for $a_{n+2} = 0$ with $a_n \neq 0$ is, as may be seen from 
(188), that 

$$\epsilon = 2n + 1. \tag{190}$$

Therefore the eigenvalues of (183') are all the positive odd 
integers. 
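
The termination of the series can be watched directly by iterating the recursion (188). The following is a small illustrative sketch in Python (the function name and its arguments are our own, not the book's); it uses exact rational arithmetic so that a vanishing coefficient is recognized without rounding error.

```python
from fractions import Fraction


def series_coefficients(eps, parity, n_terms):
    """Coefficients a_r generated by (188):

        a_{r+2} = (2r + 1 - eps) / ((r + 1)(r + 2)) * a_r.

    parity = 0 starts from a_0 = 1 (series of even powers);
    parity = 1 starts from a_1 = 1 (series of odd powers).
    Returns a dict {r: a_r} with the first n_terms indices of that parity.
    """
    coeffs = {}
    r, a = parity, Fraction(1)
    for _ in range(n_terms):
        coeffs[r] = a
        a = a * Fraction(2 * r + 1 - eps) / ((r + 1) * (r + 2))
        r += 2
    return coeffs


if __name__ == "__main__":
    # eps = 5 = 2n + 1 with n = 2: the even series stops after the xi^2 term.
    print(series_coefficients(5, 0, 5))
    # eps = 4 is not an odd integer: no coefficient ever vanishes.
    print(series_coefficients(4, 0, 5))
```

For $\epsilon = 5$ the even series reduces to $1 - 2\xi^2$, a polynomial of the second degree; for a value of $\epsilon$ that is not an odd integer the numerator $2r + 1 - \epsilon$ never vanishes and the series does not terminate.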

From (184) and (190) it follows that $E$ must have one of the 
values 

$$E_n = \left(n + \tfrac{1}{2}\right)h\nu_0 \qquad (n = 0, 1, 2, \ldots). \tag{191}$$

Hence these are the energy levels of the harmonic oscillator. It 
may easily be verified that the so-called "zero-point energy" 
($E_0 = \tfrac{1}{2}h\nu_0$) is a necessary consequence of the uncertainty principle. 

In this connection we note that the old Bohr-Sommerfeld theory 
yields, as we shall see in §54, the formula $E_n = nh\nu_0$ instead of 
(191). (This may be remembered by stating it as follows: the 
energy of the harmonic oscillator must be an integral multiple of 
the energy of a photon having a frequency equal to the characteristic 
frequency of the oscillator.) The difference between two 
successive energy levels, occurring in the phenomena of emission 
and absorption, is therefore the same in both theories; but the lowest 
level, which is zero in the Bohr-Sommerfeld theory, turns out to 
be $\tfrac{1}{2}h\nu_0$. A great number of experimental data (band spectra, for 
example) confirm formula (191). 

Let us now proceed to an examination of the eigenfunctions. 
When $\epsilon$ is given by (190), the recursion formula (188) becomes 

$$a_{r+2} = -\frac{2(n - r)}{(r + 1)(r + 2)}\,a_r.$$

Also, if $n$ is even, we must consider the solution in even powers 
($a_1 = 0$, $a_0 \neq 0$ and arbitrary); if $n$ is odd, we must take the solution 
in odd powers ($a_0 = 0$, $a_1 \neq 0$ and arbitrary). Such polynomials 
occur in the development of the successive derivatives of 
the function $e^{-\xi^2}$. In fact, it may easily be verified that 

$$\frac{d^n}{d\xi^n}\,e^{-\xi^2} = (-1)^n H_n(\xi)\,e^{-\xi^2}, \tag{193}$$

where $H_n(\xi)$ is a polynomial of the $n$th degree, of the type we are 
considering. These $H_n$ are called Hermite polynomials.⁵ The first 
four of these, for instance, are (as may easily be found) 

$$H_0 = 1, \quad H_1 = 2\xi, \quad H_2 = -2 + 4\xi^2, \quad H_3 = -12\xi + 8\xi^3.$$

Hence to the eigenvalue $E_n$ there corresponds a $v$ given by 

$$v_n = N_n H_n(\xi)$$

(where $N_n$ is a constant normalization factor) and hence, by (185), 
a $u$ given by 

$$u_n = N_n\,e^{-\xi^2/2}\,H_n(\xi).$$

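The normalization condition is 

$$\int_{-\infty}^{+\infty} u_n^2\,dx = 1,$$

or else 

$$N_n^2\,\sqrt{\frac{h}{4\pi^2 m\nu_0}}\int_{-\infty}^{+\infty} e^{-\xi^2} H_n^2(\xi)\,d\xi = 1.$$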


The integral is calculated by successive integrations by parts, 
use being made of the following property of the Hermite polynomials: 

$$\frac{dH_n}{d\xi} = 2n\,H_{n-1}, \tag{194}$$

and we find 

$$\frac{1}{N_n^2} = \sqrt{\frac{h}{4\pi^2 m\nu_0}}\;2^n\,n!\,\sqrt{\pi}.$$


By a similar procedure we may verify that 

$$\int_{-\infty}^{+\infty} u_n u_m\,dx = 0$$

for $n \neq m$, that is, that the functions $u_n$ are orthogonal. 
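
The properties of the Hermite polynomials used above may be verified numerically. The sketch below (Python, standard library only) is an illustration under stated assumptions rather than the method of the text: it builds the $H_n$ from the standard three-term recurrence $H_{k+1} = 2\xi H_k - 2kH_{k-1}$, which is not derived in this section, and all function names are our own. It then checks the derivative property $dH_n/d\xi = 2nH_{n-1}$ and estimates the integrals of $e^{-\xi^2}H_nH_m$ by the midpoint rule.

```python
import math


def hermite(n):
    """Coefficients of H_n, lowest power first, from the standard
    recurrence H_{k+1} = 2 xi H_k - 2 k H_{k-1}, with H_0 = 1, H_1 = 2 xi."""
    prev, cur = [1], [0, 2]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0] + [2 * c for c in cur]      # 2 xi H_k
        for i, c in enumerate(prev):          # minus 2 k H_{k-1}
            nxt[i] -= 2 * k * c
        prev, cur = cur, nxt
    return cur


def evaluate(coeffs, xi):
    """Horner evaluation of a polynomial given lowest power first."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * xi + c
    return value


def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def weighted_overlap(n, m, lim=10.0, steps=20000):
    """Midpoint estimate of the integral of exp(-xi^2) H_n H_m over the line."""
    hn, hm = hermite(n), hermite(m)
    d = 2.0 * lim / steps
    total = 0.0
    for i in range(steps):
        xi = -lim + (i + 0.5) * d
        total += math.exp(-xi * xi) * evaluate(hn, xi) * evaluate(hm, xi)
    return total * d


if __name__ == "__main__":
    print(hermite(2), hermite(3))
    print(derivative(hermite(3)), [6 * c for c in hermite(2)])
    print(weighted_overlap(2, 2), 2 ** 2 * math.factorial(2) * math.sqrt(math.pi))
    print(weighted_overlap(1, 3))
```

The diagonal integral reproduces $2^n\,n!\,\sqrt{\pi}$, which fixes $N_n$, and the off-diagonal one vanishes to within the accuracy of the quadrature.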

The functions $u_n$ for the first five values of $n$ are represented in 
Fig. 28, neglecting the constant factor in $h/(4\pi^2 m\nu_0)$ (the first of these 
functions, $n = 0$, is just the well-known Gaussian). The function $u_n^2$, 
which gives the probability distribution of position of the particle 
if it is known that the latter is in the $n$th quantum state, is represented 
in Fig. 29 for the first five values of $n$. In these diagrams 
the segment OC represents the amplitude of the oscillations according 
to the classical model ($OC = \sqrt{2E_n/K}$). In this connection 
we note that for $x > OC$ we have $U > E_n$, and hence the kinetic 
energy ($E_n - U$) for a particle to the right of C turns out to be 
negative, or its velocity imaginary. Therefore that region is 
inaccessible to the particle of energy $E_n$ according to classical 
mechanics. Nevertheless, as is shown by the curves of Fig. 29, 
there is the possibility of finding the particle in that region. This 
is an apparent paradox analogous to the one already explained 
in §37. 

⁵ See, for instance, No. 25 or No. 34 of the Bibliography. 
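
How large is the probability of finding the particle beyond the classical amplitude OC? In the variable $\xi$ the point C lies at $\xi^2 = 2E_n/(h\nu_0) = 2n + 1$, so the question reduces to a one-dimensional integral. The following Python sketch (our own illustration; the function names and the numerical cutoff are assumptions) evaluates it by the midpoint rule, using the normalization $2^n\,n!\,\sqrt{\pi}$ of the Hermite functions.

```python
import math


def hermite_value(n, xi):
    """H_n(xi) from the standard recurrence H_{k+1} = 2 xi H_k - 2 k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h


def density(n, xi):
    """Probability density in xi for the nth state, normalized to unity."""
    norm = (2.0 ** n) * math.factorial(n) * math.sqrt(math.pi)
    return math.exp(-xi * xi) * hermite_value(n, xi) ** 2 / norm


def outside_probability(n, lim=12.0, steps=40000):
    """Probability that |xi| exceeds sqrt(2n + 1), the classical amplitude."""
    turn = math.sqrt(2 * n + 1)
    d = (lim - turn) / steps
    tail = sum(density(n, turn + (i + 0.5) * d) for i in range(steps)) * d
    return 2.0 * tail  # the density is even in xi


if __name__ == "__main__":
    for n in range(5):
        print(n, outside_probability(n))
```

For the lowest state the integral can be done in closed form and equals the complementary error function $\operatorname{erfc}(1)$, so that the classically forbidden region is by no means negligible.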

We now recall that if, instead of the energy of the system being 
determined, the initial position and velocity of the particle are 
ascertained (with the approximations required by the uncertainty 
principle), then the probability distribution, for a subsequent 
determination of the $x$ coordinate, is given by the square of the 
modulus of a $\psi$ which is obtained as a linear combination of solutions 
corresponding to various quantum states. This combination 
represents a wave group which is the more limited the more precisely 
the initial determination of $x$ was made. Schrödinger has 
shown that this wave group will move with an oscillatory motion 
approximating the harmonic motion of classical mechanics. In 
this case, however, in contrast to what is encountered in general 
(see §36, for example), the wave group will retain its initial dimensions 
indefinitely without spreading out. 

40. Potential barrier. Of interest for its applications (some of 
which will be mentioned in what follows) is the problem of the 
motion of a particle along a line $x$, under the action of a potential 
which is zero everywhere except over a certain interval AB, where the 
potential increases to a maximum $U_0$ and then falls to zero again; this 
behavior is shown qualitatively in Fig. 30. The region AB constitutes 
what is called a potential barrier. A particle which, for instance, 
originates to the left feels no force up to A, where it encounters a 
retarding force from A to C', and from there on an accelerating force 
from C' to B, then again no force at all. 

Fig. 30 

According to classical mechanics, the particle will overcome the 
barrier if its initial kinetic energy $E$ is greater than the maximum $U_0$ 
of the potential; otherwise it will be thrown backward before 
reaching C'. 

Wave mechanics instead gives a different result: the incident 
particle has in every case a certain probability of passing over the 
barrier (even if $E < U_0$) and a certain probability of being repelled 
(even if $E > U_0$). In order to find this result in a simple manner, 
we shall idealize the problem again by giving to the potential the 
shape shown in Fig. 31: 

$U = 0$ for $x < 0$ and $x > l$ (regions I and III), 

$U = U_0$ for $0 < x < l$ (region II). 

In the more general case of Fig. 30, a process of successive 
approximations would give a result qualitatively analogous to the 
one which we shall find by giving the barrier the particular form 
of Fig. 31, made up of two "potential steps" in opposite directions. 

Hence let us consider the three regions (I, II, III) separately. For regions 
I and III, the Schrödinger equation is the same as (148), which was studied in 
§35. Its general solution is of the form (149), but the constants occurring in it 
will in general be different in the two sections. In region II, the solution will 
have the same form except for the substitution of $E - U_0$ in the place of $E$. 
We can therefore write 

$$\begin{aligned}
\text{Region I:}\quad   & u = A_1 e^{2\pi i k x} + B_1 e^{-2\pi i k x}, \\
\text{Region II:}\quad  & u = A_2 e^{2\pi i k_0 x} + B_2 e^{-2\pi i k_0 x}, \\
\text{Region III:}\quad & u = A_3 e^{2\pi i k x} + B_3 e^{-2\pi i k x},
\end{aligned} \tag{195}$$

where we have put 

$$k = \frac{\sqrt{2mE}}{h}, \qquad k_0 = \frac{\sqrt{2m(E - U_0)}}{h}.$$

Fig. 31 


Let us suppose that a large number of particles are projected 
against the barrier from left to right. Then the squares of the 
moduli of the constants $A$ and $B$ will have the meaning already 
illustrated in §37. We must also place here $B_3 = 0$ to express 
the fact that no particles come from the right. The particles will 
evidently have the same velocity in regions I and III, and the 
transmission coefficient of the barrier (that is, the probability each 
incident particle has of overcoming it) will be given by 

$$\tau = \frac{A_3 A_3^*}{A_1 A_1^*}. \tag{196}$$

We note that the five constants $A_1$, $B_1$, $A_2$, $B_2$, and $A_3$ must 
be related by the condition that $u$ is to be continuous, along with 
its derivative, at the points A and B. The real part of $u$ (as well 
as its imaginary part) will be represented in regions I and III by 
two sine curves of wavelength $\lambda = 1/k$ (since $k$ is certainly real). 
In region II, however, the curve may have two different shapes 
according to whether the kinetic energy $E$ is above or below $U_0$. In 
the first case $k_0$ is real, and hence the curve is sinusoidal also in 
region II but with a wavelength $\lambda_0 = 1/k_0$ greater than in I and III 
(Fig. 32). In the second case $k_0$ is imaginary, and hence the curve 
in region II is of the exponential type (Fig. 33). In both of these 
cases the curves of the three regions join continuously, as shown in 
the figures. This implies a relation between the amplitudes of the 
two outer sine curves, whose analytic expression we shall find, leading 
to the calculation of $\tau$. 

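The continuity conditions on $u$ and $du/dx$ for $x = 0$ give 

$$A_1 + B_1 = A_2 + B_2, \qquad A_1 - B_1 = \mu\,(A_2 - B_2),$$

where we have put 

$$\mu = \frac{k_0}{k}.$$

From this we obtain 

$$A_1 = \tfrac{1}{2}A_2(1 + \mu) + \tfrac{1}{2}B_2(1 - \mu). \tag{197}$$

Analogously, the continuity conditions for $x = l$ give 

$$A_2 e^{2\pi i k_0 l} + B_2 e^{-2\pi i k_0 l} = A_3 e^{2\pi i k l}, \qquad
\mu\left(A_2 e^{2\pi i k_0 l} - B_2 e^{-2\pi i k_0 l}\right) = A_3 e^{2\pi i k l},$$

from which we get 

$$A_2 = A_3\,\frac{\mu + 1}{2\mu}\,e^{2\pi i (k - k_0) l}, \qquad
B_2 = A_3\,\frac{\mu - 1}{2\mu}\,e^{2\pi i (k + k_0) l}.$$

Substituting these expressions into (197), we obtain the following 
relation between $A_1$ and $A_3$: 

$$A_1 = A_3\,\frac{e^{2\pi i k l}}{4\mu}\left[(\mu + 1)^2 e^{-2\pi i k_0 l} - (\mu - 1)^2 e^{2\pi i k_0 l}\right]. \tag{198}$$

From this we obtain $\tau$ by means of (196), or more conveniently 
its reciprocal: 

$$\frac{1}{\tau} = \frac{1}{16|\mu|^2}\left|(\mu + 1)^2 e^{-2\pi i k_0 l} - (\mu - 1)^2 e^{2\pi i k_0 l}\right|^2. \tag{199}$$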


In order to calculate this expression, it is better to treat the two 
cases $E > U_0$ and $E < U_0$ separately: 

Case (a): $E > U_0$. In this case $k_0$ and $\mu$ are real, and hence 
the right-hand member of (199) may be written 

$$\begin{aligned}
\frac{1}{\tau} &= \frac{1}{16\mu^2}\left[(\mu + 1)^2 e^{-2\pi i k_0 l} - (\mu - 1)^2 e^{2\pi i k_0 l}\right]
 \times \left[(\mu + 1)^2 e^{2\pi i k_0 l} - (\mu - 1)^2 e^{-2\pi i k_0 l}\right] \\
&= \frac{1}{16\mu^2}\left[(\mu + 1)^4 + (\mu - 1)^4 - 2(\mu^2 - 1)^2 \cos 4\pi k_0 l\right].
\end{aligned} \tag{199'}$$

From this equation we see that with $\mu$ fixed, the transmission coefficient 
$\tau$ varies in a periodic manner with a variation of $l$ (thickness 
of the barrier). For $\cos 4\pi k_0 l = 1$ or 

$$l = \frac{n}{2k_0} = n\,\frac{\lambda_0}{2} \qquad (n \text{ integer}), \tag{200}$$

the square bracket attains its minimum value, equal to $16\mu^2$, and 
hence 

$$\tau = \tau_{\max} = 1.$$

For $\cos 4\pi k_0 l = -1$ and hence for 

$$l = \frac{n + \frac{1}{2}}{2k_0} = \left(n + \tfrac{1}{2}\right)\frac{\lambda_0}{2},$$

we have instead 

$$\tau = \tau_{\min} = \frac{4\mu^2}{(1 + \mu^2)^2}.$$

$\tau$ always lies between these two limits. As can be seen from 
the last formula, $\tau$ never vanishes and hence there is always a 
certain probability that a particle traverses the barrier. This 
probability becomes a certainty if the thickness of the barrier contains 
an integral number of half-wavelengths, as is indicated by 
(200); note the analogy of this phenomenon with the transmission 
of light through a thin plate. But in every other case $\tau < 1$, and 
hence there is a certain probability $1 - \tau$ that the particle will be 
repelled, in contrast to what would result from classical mechanics. 
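
The periodic variation of $\tau$ with the thickness of the barrier is easily exhibited by evaluating (199) directly. In the following Python sketch (an illustration of ours; the function name and the sample values of $\mu$ are assumptions) the two arguments are the ratio $\mu = k_0/k$ and the product $k_0 l$, that is, the thickness measured in wavelengths $\lambda_0$.

```python
import cmath
import math


def transmission(mu, k0l):
    """Transmission coefficient tau from (199).

    mu  : ratio k0 / k
    k0l : product k0 * l (barrier thickness in units of lambda_0 = 1 / k0)
    """
    z = ((mu + 1) ** 2 * cmath.exp(-2j * math.pi * k0l)
         - (mu - 1) ** 2 * cmath.exp(2j * math.pi * k0l))
    return 16.0 * abs(mu) ** 2 / abs(z) ** 2


if __name__ == "__main__":
    mu = 0.5  # hypothetical value; mu^2 = 1 - U0/E for E > U0
    for k0l in (0.25, 0.5, 0.75, 1.0):
        print(k0l, transmission(mu, k0l))
    print("tau_min from the closed formula:", 4 * mu ** 2 / (1 + mu ** 2) ** 2)
```

Thicknesses equal to a whole number of half-wavelengths give complete transmission, while the odd quarter-wavelength thicknesses give the minimum value $4\mu^2/(1 + \mu^2)^2$.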

Case (b): $E < U_0$. In this case, which is the more interesting 
one as far as applications are concerned, $k_0$ and hence $\mu$ are imaginary, 
so that we may write 

$$k_0 = ik', \qquad \mu = i\mu',$$

with $k'$ and $\mu'$ real. Therefore (199) becomes 

$$\begin{aligned}
\frac{1}{\tau} &= \frac{1}{16\mu'^2}\left[(1 + i\mu')^2 e^{2\pi k' l} - (1 - i\mu')^2 e^{-2\pi k' l}\right]
 \times \left[(1 - i\mu')^2 e^{2\pi k' l} - (1 + i\mu')^2 e^{-2\pi k' l}\right] \\
&= \frac{1}{16\mu'^2}\left[-(1 + i\mu')^4 - (1 - i\mu')^4 + 2(1 + \mu'^2)^2 \cosh 4\pi k' l\right] \\
&= 1 + \frac{(1 + \mu'^2)^2}{8\mu'^2}\left(\cosh 4\pi k' l - 1\right)
 = 1 + \frac{(1 + \mu'^2)^2}{4\mu'^2}\,\sinh^2 2\pi k' l.
\end{aligned} \tag{201}$$

We see that once $\mu'$ has been fixed, the right-hand member 
steadily increases with increasing $l$, and hence $\tau$ steadily decreases 
with an increase of the barrier thickness (it becomes negligible 
when $l$ becomes large compared with $1/k'$). However, we see from 
the formula that $\tau$ does not vanish (of course, as long as $U_0/E$ and 
$l$ remain finite), so that there is always a certain probability of 
penetrating the barrier and of continuing beyond indefinitely, even 
for a particle with energy below $U_0$. 

Here apparently the same difficulty occurs which was mentioned 
in connection with the potential step, inasmuch as if we were to 
imagine the particle as actually moving continuously in the ordinary 
sense, we should be led to conclude that the particle has traversed 
a region in which its kinetic energy must be negative. But in 
reality, as we know, we are not entitled to speak of the motion of a 
particle between two successive observations. And if an observation 
of position made us find the particle in the region where $U > E$, 
this would imply no paradox at all, since (as has already been 
observed) the very fact that we observed its position altered the 
value of its energy. 
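
Returning for a moment to formula (201), its agreement with the general expression (199) can be confirmed numerically by inserting imaginary values of $\mu$ and $k_0$ into the latter. The Python sketch below is our own illustration; the function names and the sample values of $\mu'$ and $k'l$ are assumptions.

```python
import cmath
import math


def tau_from_199(mu, k0l):
    """tau from (199); mu and k0l may be complex."""
    z = ((mu + 1) ** 2 * cmath.exp(-2j * math.pi * k0l)
         - (mu - 1) ** 2 * cmath.exp(2j * math.pi * k0l))
    return 16.0 * abs(mu) ** 2 / abs(z) ** 2


def tau_from_201(mu_p, kpl):
    """tau from the closed form (201), valid for E < U0.

    mu_p : mu' (real), with mu = i mu'
    kpl  : product k' * l, with k0 = i k'
    """
    s = math.sinh(2.0 * math.pi * kpl)
    return 1.0 / (1.0 + (1.0 + mu_p ** 2) ** 2 / (4.0 * mu_p ** 2) * s * s)


if __name__ == "__main__":
    mu_p = 0.7  # hypothetical value of mu'
    for kpl in (0.1, 0.3, 0.5, 1.0):
        print(kpl, tau_from_199(1j * mu_p, 1j * kpl), tau_from_201(mu_p, kpl))
```

The two columns coincide, and $\tau$ falls off steadily, roughly as $e^{-4\pi k'l}$, once $l$ exceeds a few times $1/k'$.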

The optical analogue of the phenomenon considered here is the following. 
It is known that the special luminous disturbance which exists 
in the second medium during total reflection 
may be, so to speak, captured and transformed 
into ordinary radiation, by placing, 
at a very small distance from the reflecting 
surface, the surface of another medium 
equal to the one from which the light 
emerges. Thus if AMN (Fig. 34) is a glass 
prism on whose hypotenuse MN the total 
reflection takes place, upon approaching to 
within a very small distance (of the order 
of a wavelength) with a second glass prism 
A'M'N', we see that part of the incident 
light passes into that prism and emerges 
from the face A'N'. This phenomenon 
occurs despite the fact that in the passage 
of light from the first prism to the air layer there results (according to 
geometrical optics) an imaginary angle of refraction. The air layer MNM'N' 
is perfectly analogous to the potential barrier considered above.⁶ 

As far as a direct experimental verification of this result of wave 
mechanics is concerned, it must be remembered that this seems 
practically impossible, owing to the fact that the thickness $l$ of the 
potential barrier would have to be of the order of a de Broglie wavelength, 
which even for the slowest cathode rays is always very short. 
Nevertheless, as we shall see in the next few sections, there exist 
experimental facts which may be interpreted as an indirect confirmation 
of this remarkable phenomenon. 

⁶ There is, however, the difference that with de Broglie waves the phenomenon 
occurs even at normal incidence, but not with light. 


41. Note on the theory of cold emission of electrons. It has 
been found experimentally that upon application of a strong electric 
field $X$, in vacuum, to the surface of a metal, the field being directed 
toward the metal, electrons whose number is proportional to the 
quantity $e^{-b/X}$, where $b$ is a constant, will be induced to leave the 
metal. The considerations of the preceding section permit a simple 
explanation of this phenomenon which is partly quantitative. 

It is known that in order to remove an electron from a metal, a 
certain amount of energy $U_0$ is required for the process of crossing 
the surface. Therefore we may say that the metallic surface represents 
a potential step of height $U_0$ for the electrons, so that, if we 
draw the x-axis perpendicular to the surface of the metal, the potential 
$U$ of the forces acting upon an electron has the shape of the 
line AOBC in Fig. 35: it is the wall OB which prevents the electrons 
(whose energy is less than $U_0$) from leaving the metal. 
When the electric field is applied to the outside of the conductor, 
the part BC of the potential is replaced by the sloped line 
BC' (whose slope $\tan\alpha$ represents the force $eX$). Thus, in 
place of the potential step, a potential "barrier" OBD is formed which, 
according to classical mechanics, would be just as insurmountable for 
the electrons. But instead, as we have seen in the previous section, 
wave mechanics permits some of the electrons to penetrate this barrier 
and leave the metal. 

Fig. 35 

Fowler and Nordheim⁷ have developed a theory of this phenomenon 
on the foregoing basis, accounting for it in a satisfactory manner, 
even quantitatively. We confine ourselves here to pointing out 
that the formulas of the preceding section permit us to foresee that 
the intensity of the emitted electron current will be expressed, as a 
function of the field $X$, by a law of the indicated type. In fact, 
Fig. 35 shows that the width OD of the barrier is given by 

$$OD = l = \frac{U_0}{\tan\alpha} = \frac{U_0}{eX},$$

and that (ignoring the difference in shape of the barrier, which does 
not modify the aspect of the qualitative argument) if we identify 
OD with the thickness of the barrier considered in the last section, 
the quantity $4\pi k'l$ occurring in formula (201) becomes 

$$4\pi k'l = 4\pi\,\frac{\sqrt{2m(U_0 - E)}}{h}\,\frac{U_0}{eX} = \frac{b}{X}, \tag{202}$$

where the portion independent of $X$ is designated by $b$. Substituting 
values which occur in practical cases, we find that $4\pi k'l$ comes 
out at least of the order of 100. Hence, applying formula (201), we 
may replace $\cosh 4\pi k'l$ by $\tfrac{1}{2}e^{4\pi k'l}$ and may neglect unity compared 
with this term. The formula is thus reduced to 

$$\frac{1}{\tau} = \frac{(1 + \mu'^2)^2}{16\mu'^2}\,e^{b/X}, \tag{203}$$

which yields for the transmission coefficient $\tau$, and hence for the number 
of electrons emitted, the proper expression suggested by experimental 
data. 

⁷ Proc. Roy. Soc. A 119, 173 (1928). 
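
The order of magnitude quoted for $4\pi k'l$ may be illustrated with a rough numerical estimate. The Python sketch below is only that: the barrier width is taken as $l = U_0/(eX)$, and the numerical values (a step $U_0$ of 10 electron volts, electrons lying 4 electron volts below the top of the step, a field of $10^9$ volts per metre) are hypothetical round figures of ours, not data from the text.

```python
import math

H = 6.62607015e-34          # Planck constant h (J s)
M_E = 9.1093837015e-31      # electron mass (kg)
E_CHARGE = 1.602176634e-19  # elementary charge (C), numerically J per eV


def exponent(binding_ev, step_ev, field_v_per_m):
    """The quantity 4 pi k' l = b / X of the cold-emission argument.

    binding_ev    : U0 - E in electron volts (assumed value)
    step_ev       : U0 in electron volts (assumed value)
    field_v_per_m : applied field X in volts per metre (assumed value)
    """
    k_prime = math.sqrt(2.0 * M_E * binding_ev * E_CHARGE) / H
    width = step_ev / field_v_per_m  # U0 / (e X) in metres, with U0 given in eV
    return 4.0 * math.pi * k_prime * width


if __name__ == "__main__":
    for field in (1e9, 2e9, 5e9):
        print(field, exponent(4.0, 10.0, field))
```

Since the exponent is inversely proportional to $X$, doubling the field halves it, and the emitted current, being proportional to $e^{-b/X}$, rises enormously.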

42. Particle between two potential barriers. Let us now consider 
the case in which the potential has the shape of Fig. 36, which 
may be considered to be built up of two symmetric potential barriers 
of height $U_0$, enclosing a central "valley"; the bottom of 
the latter may also be at a lower level than the external potential, 
as in the figure. 

Fig. 36 

From the point of view of classical mechanics, three cases 
must be distinguished. If $E > U_0$ (for example, level $E'$), a particle 
will traverse the entire double barrier; if $U_0 > E > 0$ (for example, 
level $E''$), three types of clearly distinct motions are possible: (1) 
the particle comes from the left, will be repelled by the first barrier, 
and turn back; (2) the particle comes from the right and executes a 
similar motion along the positive x-axis; or (3) the particle finds 
itself between the two barriers and then performs periodic oscillations. 
If $0 > E > -U_1$ (level $E'''$), only the oscillatory motion 
between the two barriers is possible. 





In order to study the problem according to wave mechanics, 
it is convenient again to idealize the potential by giving it the shape 
of Fig. 37. We then apply the same procedure as in §40, solving 
the Schrödinger equation separately for each of the five regions into 
which the points A, B, C, and D divide the x-axis, and imposing the 
condition that $u$ and $du/dx$ be continuous at these points, which 
establishes a relation between the constants of the various regions. 
The result obtained may also be extended qualitatively to the case 
of Fig. 36. Its most remarkable result applies to the case where 
$U_0 > E > 0$ (levels of the type $E''$). 

For a brief outline of this result, we shall start out with the 
obvious observation that if the two barriers became infinitely high, 
a particle in the central region would be subject to the conditions 
examined in §38 (particle on a line segment), and hence its energy 
could take on only certain discrete values $E_1$, $E_2$, . . . ; in other 
words, the energy would be "quantized." But if we consider 
barriers of finite height, these no longer represent insurmountable 
obstacles for the particle; the energy is no longer quantized, and may 
assume any value. We find, however, the following result: $\psi$ is 
evidently of an oscillatory nature in the central region (III) as well 
as in the two external regions (I, V), whereas it is of the exponential 
type in regions II and IV. If the energy has any arbitrary value, 
the amplitude of $\psi$ in the central region will in general be considerably 
smaller than in the outer regions, because the exponential portions 
which join it approach the x-axis toward the interior. Consequently, 
the particle has a very small probability of finding itself 
in the central region, and if found there it has a strong tendency to 
leave. Only in the case where the energy lies close to one of the 
values $E_1$, $E_2$, . . . (which in the case of infinitely high barriers are 
the only permitted values) does the amplitude in the central region 
become much larger than in the outer regions; that is, the particle 
has a large probability of being found inside. Hence we can say 
that instead of rigorous quantization, we are dealing with quasi 
quantization, meaning that the discrete levels $E_1$, $E_2$, . . . are no 
longer represented by sharp horizontal lines but by narrow shaded 
bands. They also no longer represent the only permitted values of 
$E$; instead, they represent the values of $E$ which give to the particle 
a considerable probability of being found inside the "valley" 
limited by the two barriers. 

For energy levels below 0 (type $E'''$) we have ordinary quantization 
and $\psi$ is practically confined to the central region. On the 
other hand, the levels higher than the barriers, that is, of type $E'$, 
are not quantized at all and correspond to a $\psi$ which is sinusoidal 
everywhere. 
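
The quasi quantization described in this section can be made visible by carrying out the matching of the five regions numerically. The Python sketch below is a minimal illustration under stated assumptions, not the calculation of the text: it uses units in which $h/2\pi = 1$ and $2m = 1$, so that the quantity $2\pi k$ of region $j$ is simply $\sqrt{E - U_j}$; the barrier height, the widths, and the function names are hypothetical choices; and the bottom of the valley is placed at the external level. Starting from a unit transmitted wave in region V, it imposes the continuity of $u$ and $du/dx$ at each boundary in turn, working from right to left.

```python
import cmath


def amplitudes(E, edges, potentials):
    """Return [(A_j, B_j)] for every region, for a unit transmitted wave.

    In region j the solution is A_j exp(i q_j x) + B_j exp(-i q_j x) with
    q_j = sqrt(E - U_j); edges are the boundary abscissas, and potentials
    holds one value U_j per region (len(edges) + 1 entries).
    """
    q = [cmath.sqrt(E - U) for U in potentials]
    a, b = 1.0 + 0j, 0j                      # last region: outgoing wave only
    result = [(a, b)]
    for j in range(len(edges) - 1, -1, -1):  # march from right to left
        x, qr, ql = edges[j], q[j + 1], q[j]
        fwd, bwd = a * cmath.exp(1j * qr * x), b * cmath.exp(-1j * qr * x)
        u, du = fwd + bwd, 1j * qr * (fwd - bwd)   # u and du/dx at the edge
        a = 0.5 * (u + du / (1j * ql)) * cmath.exp(-1j * ql * x)
        b = 0.5 * (u - du / (1j * ql)) * cmath.exp(1j * ql * x)
        result.append((a, b))
    result.reverse()
    return result


def transmission(E, edges, potentials):
    """tau = |A_last|^2 / |A_first|^2 (outer regions at the same potential)."""
    return 1.0 / abs(amplitudes(E, edges, potentials)[0][0]) ** 2


def central_ratio(E, edges, potentials):
    """Intensity in the central region relative to the incident intensity."""
    amps = amplitudes(E, edges, potentials)
    a_c, b_c = amps[len(amps) // 2]
    return (abs(a_c) ** 2 + abs(b_c) ** 2) / abs(amps[0][0]) ** 2


if __name__ == "__main__":
    edges = [0.0, 0.5, 2.5, 3.0]              # points A, B, C, D (assumed)
    potentials = [0.0, 10.0, 0.0, 10.0, 0.0]  # regions I to V (assumed)
    for i in range(1, 25):
        E = 0.1 * i
        print(E, transmission(E, edges, potentials),
              central_ratio(E, edges, potentials))
```

For most energies both printed quantities are small; in a narrow band of energies, lying somewhat below the lowest level that the valley would possess with infinitely high walls, they rise sharply, which is the behavior described above.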

43. Note on the theory of alpha-particle emission. The considerations 
of the last section constitute the basis for a remarkable 
theory devised by Gamow, and independently by Gurney and Condon, 
in order to account for the spontaneous emission of α-particles 
from the nuclei of radioactive substances. 

This phenomenon gave rise to a serious theoretical difficulty. 
Indeed, it was possible to determine, by means of experiments of 
the type which had led Rutherford to the discovery of the atomic 
nucleus (scattering of α-particles), what force acts upon an α-particle 
in the neighborhood of a nucleus. It was found that this force is of 
the Coulomb type and repulsive, as long as the distance from the 
nucleus is not below a certain limit (of the order of $10^{-12}$ or $10^{-13}$ cm). 
In the cases in which it was experimentally possible to go below that 
limit (that is, for light elements) it was found that when the nucleus 
was approached, the repulsion increased less rapidly than was 
required by Coulomb's law. This effect gave indication that this 
repulsion was to decrease further and finally to turn into an attraction, 
as is required for the interior of the nucleus in order to ensure 
its stability. The potential $U(r)$ will therefore be given by a curve 
of the shape shown in Fig. 38, with a maximum $U_0$ corresponding 
to the distance $r_0$ at which the force becomes attractive. The 
α-particles which are part of the nucleus are within the region corresponding 
to the potential "valley." The case of greatest interest 
is the one of uranium I (U I). For this element it has been possible 
to make an experimental determination of the curve only down to 
distances of about $3 \times 10^{-12}$ cm (solid portion in the figure), but 
this is sufficient to exhibit the paradox mentioned previously, 
which is as follows: the α-particles emitted by uranium I have an 
energy of about 4 million volts, which is much less (about one 
half) than the energy corresponding to the highest point M of the 
curve that was ascertained experimentally. The energy corresponds 
to α-particles which would have originated, with zero initial velocity, 
from a distance $6 \times 10^{-12}$ cm (point N of the curve). Therefore 
(even independently of any hypothesis concerning the shape of the 
curve in the portion not established by experiment) it remains unexplained, 
from the classical point of view, how particles having an 
energy less than the maximum, $U_0$, of the potential barrier surrounding 
the nucleus have succeeded in issuing from the nucleus. 

Fig. 38 (ordinate in units of $10^6$ volts) 
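
The distances and energies quoted in this paragraph are connected by Coulomb's law alone, and the connection is easily checked. The Python sketch below is an illustration of ours: it takes the repelling charge to be that of the residual nucleus (90 units, an assumption on our part) and uses the standard value $e^2 \approx 1.44$ million electron volts times $10^{-13}$ cm; the function names are hypothetical.

```python
E2_MEV_FM = 1.43996  # e^2 in MeV * fm (1 fm = 1e-13 cm); standard value


def coulomb_distance_cm(z_nucleus, energy_mev, z_particle=2):
    """Distance at which the Coulomb energy equals the given energy."""
    r_fm = z_particle * z_nucleus * E2_MEV_FM / energy_mev
    return r_fm * 1e-13


def coulomb_energy_mev(z_nucleus, distance_cm, z_particle=2):
    """Coulomb energy of the particle at the given distance."""
    return z_particle * z_nucleus * E2_MEV_FM / (distance_cm / 1e-13)


if __name__ == "__main__":
    # Point N: where would a 4 MeV alpha particle have started from rest?
    print(coulomb_distance_cm(90, 4.0))
    # Point M: Coulomb energy at the smallest distance explored, 3e-12 cm.
    print(coulomb_energy_mev(90, 3e-12))
```

The first figure falls between 6 and 7 times $10^{-12}$ cm, and the second is roughly twice the energy of the emitted particles, in accordance with the statements above.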

From the point of view of wave mechanics, however, this difficulty 
disappears, because, as we have seen, a particle may occasionally 
penetrate a potential barrier higher than its own energy. The 
α-particles of the nucleus behave qualitatively like the particles of 
the one-dimensional problem studied in the last section (from which 
the present problem differs by having spherical symmetry, which 
does not affect anything essential); they will ordinarily remain in 
the interior of the nucleus, with certain quasi-quantized energy 
levels, but will have a certain probability of leaving it, passing, as it 
were, through the barrier but keeping the same energy. This effect 
is in perfect accord with the experimental fact that the α-particles 
emitted from a given substance all have certain well-defined energies 
(that is, they have a "line spectrum") which just represent the 
energy levels in question. In practice these energy levels turn out 
to be almost perfectly quantized, the width of the bands being 
altogether negligible. A quantitative determination of these energy 
levels would require an exact knowledge of the potential curve, of 
which, however, we know only the qualitative behavior for the 
interior. Fortunately, this exact knowledge is not necessary for a 
calculation of the transmission coefficient of the barrier, that is, the 
probability that a particle will penetrate it, upon which probability 
evidently depends the mean life of the radioactive element 
considered. 

In this way a simple relation* between the mean life of the element
and the velocity of the α-particles emitted was found. This
relation had already been observed empirically by Geiger and Nuttall
for all radioactive substances emitting α-particles.
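The dependence of the mean life on the transmission coefficient can be illustrated with a rough numerical sketch. The code below evaluates the usual semiclassical (WKB) penetration exponent $G$ for a pure Coulomb barrier, so that the transmission probability is of the order of $e^{-G}$. This is a common textbook estimate and not necessarily the calculation referred to above; the nuclear radius of 9 fm, the residual nuclear charge of 90, and the rounded constants are assumptions made only for the example.

```python
import math

# Rounded constants in MeV and femtometres (1 fm = 1e-13 cm); assumed values.
HBARC = 197.327      # hbar * c, in MeV fm
E2 = 1.43996         # e^2 in Gaussian units, in MeV fm
M_ALPHA = 3727.38    # rest energy of the alpha particle, in MeV


def turning_point(energy, z_daughter):
    """Distance b (fm) at which the Coulomb energy 2 Z e^2 / r equals `energy`."""
    return 2.0 * z_daughter * E2 / energy


def gamow_exponent(energy, z_daughter, radius):
    """Closed-form WKB exponent G for a Coulomb barrier between radius and b."""
    b = turning_point(energy, z_daughter)
    if radius >= b:
        return 0.0                                   # no barrier left to cross
    s = radius / b
    k = math.sqrt(2.0 * M_ALPHA * energy) / HBARC    # wave number far away, 1/fm
    return 2.0 * k * b * (math.acos(math.sqrt(s)) - math.sqrt(s * (1.0 - s)))


def gamow_exponent_numeric(energy, z_daughter, radius, steps=20000):
    """Same exponent by midpoint integration of 2 * sqrt(2 m (U - E)) / hbar."""
    b = turning_point(energy, z_daughter)
    h = (b - radius) / steps
    total = 0.0
    for i in range(steps):
        r = radius + (i + 0.5) * h
        u = 2.0 * z_daughter * E2 / r                # Coulomb energy at r
        total += math.sqrt(2.0 * M_ALPHA * (u - energy))
    return 2.0 * total * h / HBARC


if __name__ == "__main__":
    for e in (4.0, 6.0, 8.0):
        print(e, turning_point(e, 90), gamow_exponent(e, 90, 9.0))
```

Under these assumptions the turning point for 4 MeV falls near 65 fm, that is, about $6 \times 10^{-12}$ cm, in line with point N of the curve, and the exponent decreases markedly as the energy grows, which is the qualitative content of the Geiger-Nuttall relation.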

* See No. 29 of the Bibliography.

Three-dimensional Problems


44. Particle subject to no forces. In this chapter we shall study
the wave mechanics of a particle, removing the restriction of the
preceding chapter that all quantities depend only on $x$ and $t$; hence
three spatial coordinates will now enter, in addition to the time.

The first problem treated here is that of the free particle in
space with no forces present. Its most general $\psi$ may be obtained
(see §29) as a sum of monochromatic components each of which
corresponds to a definite value of the energy $E$. Hence we shall
look for these monochromatic solutions, of frequency $\nu = E/h$,
that is, of the form (128) of §27.

Again let us take up equation (131'), which $u(x, y, z)$ satisfies,
writing it with $\Delta$ expressed explicitly and with $U = 0$:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} + \frac{8\pi^2 m}{h^2}\,Eu = 0.$$


This well-known equation may be integrated by the method of
separation of variables, that is, by letting

$$u(x, y, z) = X(x)\,Y(y)\,Z(z). \tag{204}$$

The equation then becomes (after we divide through by $XYZ$):

$$\frac{X''}{X} + \frac{Y''}{Y} + \frac{Z''}{Z} + \frac{8\pi^2 m}{h^2}\,E = 0.$$

If we observe that each of the first three terms depends on only
one of the coordinates, we realize that, in order to have the equation
satisfied for any $x$, $y$, $z$, we must have

$$\frac{X''}{X} = c_1, \qquad \frac{Y''}{Y} = c_2, \qquad \frac{Z''}{Z} = c_3, \tag{205}$$

where $c_1$, $c_2$, $c_3$ are constants satisfying

$$c_1 + c_2 + c_3 + \frac{8\pi^2 m}{h^2}\,E = 0. \tag{206}$$


The first equation of (205) has a general solution of the type

$$X = a_1 e^{\sqrt{c_1}\,x} + b_1 e^{-\sqrt{c_1}\,x},$$

which would become infinite either for $x = +\infty$ or for $x = -\infty$
if $c_1$ were positive. However, since it is required that $u$ be finite
everywhere, we conclude that $c_1 \leq 0$, so that in the expression for $X$
the exponents will become imaginary (or zero) and $X$ will be of an
oscillatory character (or constant). The same arguments hold for
$Y$ and for $Z$. We may therefore put (indicating three new constants
by $k_x$, $k_y$, $k_z$)

$$c_1 = -4\pi^2 k_x^2, \qquad c_2 = -4\pi^2 k_y^2, \qquad c_3 = -4\pi^2 k_z^2.$$

Thereupon (206) becomes, after we set $k_x^2 + k_y^2 + k_z^2 = k^2$,

$$k^2 = \frac{2mE}{h^2}, \tag{207}$$

and the general solutions of equations (205) may be written

$$X = a_1 e^{2\pi i k_x x} + b_1 e^{-2\pi i k_x x},$$
$$Y = a_2 e^{2\pi i k_y y} + b_2 e^{-2\pi i k_y y},$$
$$Z = a_3 e^{2\pi i k_z z} + b_3 e^{-2\pi i k_z z}.$$

Each of these functions is of the same form as the $u$ found for the
one-dimensional problem [§35, formula (149)]. As we saw for that
case, we are able to consider only the first terms (the second terms
are obtained by changing the sign of $k_x$, $k_y$, $k_z$), and then $u$ takes
on the form

$$u = A\,e^{2\pi i(k_x x + k_y y + k_z z)} \tag{208}$$

and $\psi$ becomes

$$\psi = A\,e^{2\pi i(k_x x + k_y y + k_z z - \nu t)}, \tag{209}$$

where [see (207)]

$$\nu = \frac{E}{h} = \frac{h}{2m}\,k^2. \tag{207'}$$

It is convenient (as in §15) to introduce the vector $\mathbf{k}$ with
components $k_x$, $k_y$, $k_z$ (and hence of modulus $k$) and the vector $\mathbf{r}$
originating at the origin of the coordinate axes and ending at the
point P where $\psi$ is being calculated, $\mathbf{r}$ therefore having components
$x$, $y$, $z$. Now the preceding formula is written in invariant form,
that is, independent of the axes:

$$\psi = A\,e^{2\pi i(\mathbf{k}\cdot\mathbf{r} - \nu t)}. \tag{210}$$

This formula is identical with (78') of §15. As we have seen,
(78') represents a plane wave train having $\mathbf{k}$ for its "propagation
vector," and hence a wavelength $\lambda = 1/k$ and a direction of
propagation given by the vector $\mathbf{k}$. Therefore the phenomenon is
physically the same as the one studied in §35 but is now referred
to arbitrary axes. Consequently, this wave train represents a
particle whose momentum $\mathbf{p}$ is exactly determined ($\mathbf{p} = h\mathbf{k}$), and
whose energy is exactly defined as

$$E = \frac{h^2}{2m}\,k^2,$$

while its position is totally indeterminate. If $\mathbf{k}$ and $\nu$ in (210) are
expressed in terms of $\mathbf{p}$, the expression becomes

$$\psi = A\,e^{\frac{2\pi i}{h}\left(\mathbf{p}\cdot\mathbf{r} - \frac{p^2}{2m}t\right)}. \tag{210'}$$

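We now proceed to consider the most general solutions, obtained
by superimposing infinite wave trains like the one above, but with
different propagation vectors; these are the solutions analogous to
(79) of §15. Expressing $\mathbf{k}$ in terms of $\mathbf{p}$ in (79) and calling $\varphi_0(\mathbf{p})$
the function $h^{-3/2}A(\mathbf{k})$ (proportional to the amplitude of the wave
train of momentum $\mathbf{p}$), we obtain the following expression for $\psi$:

$$\psi = h^{-3/2}\iiint \varphi_0(\mathbf{p})\,e^{\frac{2\pi i}{h}\left(\mathbf{p}\cdot\mathbf{r} - \frac{p^2}{2m}t\right)}\,dp_x\,dp_y\,dp_z \tag{211}$$

or else, putting

$$\varphi(\mathbf{p}, t) = \varphi_0(\mathbf{p})\,e^{-\frac{2\pi i}{h}\frac{p^2}{2m}t}, \tag{212}$$

$$\psi = h^{-3/2}\iiint \varphi(\mathbf{p}, t)\,e^{\frac{2\pi i}{h}\,\mathbf{p}\cdot\mathbf{r}}\,dp_x\,dp_y\,dp_z. \tag{213}$$

For $t = 0$, $\psi$ becomes

$$\psi_0 = h^{-3/2}\iiint \varphi_0(\mathbf{p})\,e^{\frac{2\pi i}{h}\,\mathbf{p}\cdot\mathbf{r}}\,dp_x\,dp_y\,dp_z. \tag{213'}$$

Remembering that $A$ of (79) is given by (80), we see that $\varphi_0$ is
obtained from the initial wave function $\psi_0$ by the formula

$$\varphi_0(\mathbf{p}) = h^{-3/2}\iiint \psi_0\,e^{-\frac{2\pi i}{h}\,\mathbf{p}\cdot\mathbf{r}}\,dx\,dy\,dz; \tag{214}$$

hence we get $\varphi$ from $\psi$ by means of

$$\varphi = h^{-3/2}\iiint \psi\,e^{-\frac{2\pi i}{h}\,\mathbf{p}\cdot\mathbf{r}}\,dx\,dy\,dz. \tag{215}$$

Note the analogy between expressions (213) and (215), which may
be considered each other's inverse and in which the functions $\psi$ and $\varphi$
play symmetrical roles.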

Recalling the principle of superposition, we may interpret the
solution (213) in the following manner. When the state of the
particle is represented by (213), if we perform a measurement of
the momentum, there is a probability

$$|\varphi_0|^2\,dp_x\,dp_y\,dp_z = |\varphi|^2\,dp_x\,dp_y\,dp_z$$

of finding the components of the momentum lying between $p_x$ and
$p_x + dp_x$, $p_y$ and $p_y + dp_y$, $p_z$ and $p_z + dp_z$. The function $\varphi(\mathbf{p})$, as
far as the measurements of momentum are concerned, has the same
significance which $\psi$ has with respect to measurements of position.
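The statement that $|\varphi|^2$ is the momentum density can be checked numerically. The following is a minimal sketch restricted to one dimension (the one-dimensional analogue of (214), with the factor $h^{-1/2}$), for a Gaussian packet; the units $h = 1$, the width, the mean momentum, and the sampling grids are all assumptions chosen for convenience. For such a packet the product of the two spreads should come out equal to $h/4\pi$.

```python
import cmath
import math

H = 1.0      # Planck's constant; h = 1 is an arbitrary choice of units
SIGMA = 1.0  # position spread of the packet (assumed)
P0 = 2.0     # mean momentum of the packet (assumed)


def psi0(x):
    """Normalized Gaussian packet centred at x = 0 with mean momentum P0."""
    norm = (2.0 * math.pi * SIGMA**2) ** -0.25
    envelope = math.exp(-x * x / (4.0 * SIGMA**2))
    return norm * envelope * cmath.exp(2j * math.pi * P0 * x / H)


def spread(points, density, step):
    """Return (standard deviation, total probability) of a sampled density."""
    total = sum(density) * step
    mean = sum(q * w for q, w in zip(points, density)) * step / total
    var = sum((q - mean) ** 2 * w for q, w in zip(points, density)) * step / total
    return math.sqrt(var), total


def packet_spreads(n=401):
    """Sample psi0, transform it to momentum space, and measure both spreads."""
    xs = [-10.0 * SIGMA + 20.0 * SIGMA * i / (n - 1) for i in range(n)]
    dx = xs[1] - xs[0]
    # Momentum window around P0; wide enough for SIGMA = 1 and H = 1.
    ps = [P0 - 1.0 + 2.0 * i / (n - 1) for i in range(n)]
    dp = ps[1] - ps[0]
    psi = [psi0(x) for x in xs]
    # One-dimensional analogue of (214), evaluated by a simple Riemann sum.
    phi = [
        sum(f * cmath.exp(-2j * math.pi * p * x / H) for f, x in zip(psi, xs))
        * dx / math.sqrt(H)
        for p in ps
    ]
    delta_x, norm_x = spread(xs, [abs(f) ** 2 for f in psi], dx)
    delta_p, norm_p = spread(ps, [abs(g) ** 2 for g in phi], dp)
    return delta_x, delta_p, norm_x, norm_p


if __name__ == "__main__":
    dx_, dp_, nx_, np_ = packet_spreads()
    print(dx_ * dp_, H / (4.0 * math.pi), nx_, np_)
```

Both densities integrate to unity, as the symmetric factors $h^{-1/2}$ require, and the Gaussian packet attains the smallest product of spreads that the uncertainty relations allow.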

It is hardly necessary to point out that if $\psi_0$ given by (213') is
different from zero only in a limited region of space, it will represent
a wave packet which moves approximately with uniform rectilinear
motion; that is, it moves like a particle in classical mechanics,
spreading out gradually in three dimensions, however, as may be seen by
generalizing the calculation of §36.

In this case too, it may be verified that wave mechanics contains
within itself the uncertainty principle (thus generalizing what
has been said in §36 in connection with the one-dimensional
problem); it suffices to take expressions (83) of Chapter 5 and to express $\mathbf{k}$
in terms of $\mathbf{p}$, thus obtaining

$$\Delta x\,\Delta p_x \geq \frac{h}{4\pi}, \qquad \Delta y\,\Delta p_y \geq \frac{h}{4\pi}, \qquad \Delta z\,\Delta p_z \geq \frac{h}{4\pi}, \tag{216}$$

which embody the uncertainty principle for a particle in space.

45. Particle in a rectangular box. In analogy to what we did
in the one-dimensional case (§38), we may now briefly consider the
case of a particle constrained to remain within a rectangular box of
sides $a$, $b$, $c$, with perfectly reflecting walls. No forces are present
in the interior of the box. Let us first look for a stationary solution
corresponding to a given value of $E$, or to a single frequency, being
guided by the analogy with the problem of light or sound waves
in a box with reflecting walls.

It is clear that a system of plane waves propagating in a direction
with direction cosines $\alpha$, $\beta$, $\gamma$ will, upon reflection from the
walls, give rise to a system of waves characterized by the direction
cosines $\pm\alpha$, $\pm\beta$, $\pm\gamma$ (where the signs may be combined in all eight
possible ways). Therefore we cannot use a solution like (209) by
itself but instead must take a sum of eight terms, of the type

$$b\,e^{2\pi i(\pm k_x x \pm k_y y \pm k_z z - \nu t)}. \tag{217}$$

But now we must have $u = 0$ along the walls for all $t$. This
condition leads to a restriction of the arbitrariness of the propagation
vector $\mathbf{k}$. In fact, let us suppose that two walls correspond to the
planes $x = 0$, $x = a$, so that the above-mentioned sum may be
written in the form

$$u = e^{2\pi i k_x x} f_1(y, z, t) + e^{-2\pi i k_x x} f_2(y, z, t);$$

we see immediately that to satisfy the condition $u = 0$ for $x = 0$
and for $x = a$ for any $y$, $z$, $t$, we must have $2\pi k_x a = n_1\pi$, with $n_1$ an
integer, and similarly for $k_y$ and $k_z$. Hence

$$k_x = \frac{n_1}{2a}, \qquad k_y = \frac{n_2}{2b}, \qquad k_z = \frac{n_3}{2c},$$

and since $k = 1/\lambda$, it may be concluded that the only wavelengths
which may give rise to stationary waves are the ones expressible
by the formula

$$\frac{1}{\lambda^2} = \frac{1}{4}\left(\frac{n_1^2}{a^2} + \frac{n_2^2}{b^2} + \frac{n_3^2}{c^2}\right). \tag{219}$$

By (207), the corresponding values of the energy are

$$E_{n_1 n_2 n_3} = \frac{h^2}{8m}\left(\frac{n_1^2}{a^2} + \frac{n_2^2}{b^2} + \frac{n_3^2}{c^2}\right). \tag{220}$$

Thus the system has a triple infinity of discrete energy levels
characterized by three quantum numbers $n_1$, $n_2$, $n_3$.

Analogously in acoustics, a rectangular cavity may resonate
only for certain definite frequencies (proper frequencies), to which
there correspond wavelengths given exactly by (219).

In the more general case of a particle enclosed in a cavity of any 
shape, the problem is always analogous to the acoustical problem 
of finding the proper frequencies, or resonance frequencies, for which 
stationary waves may be set up in the interior of the cavity; this 
problem always leads to discrete energy levels. 
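Formula (220) lends itself to a direct enumeration of the lowest levels. The sketch below tabulates the energies in units of $h^2/8m$ and counts how many triples of quantum numbers share each value. The function names are illustrative, and it is assumed here that only positive integers $n_1$, $n_2$, $n_3$ label distinct states (a zero value makes the standing wave vanish identically, and a change of sign reproduces the same standing wave).

```python
from collections import Counter
from fractions import Fraction


def box_levels(a, b, c, n_max=4):
    """Map (n1, n2, n3) to the energy (220), expressed in units of h^2/(8m)."""
    a2, b2, c2 = Fraction(a) ** 2, Fraction(b) ** 2, Fraction(c) ** 2
    levels = {}
    for n1 in range(1, n_max + 1):
        for n2 in range(1, n_max + 1):
            for n3 in range(1, n_max + 1):
                levels[(n1, n2, n3)] = n1 * n1 / a2 + n2 * n2 / b2 + n3 * n3 / c2
    return levels


def degeneracies(levels):
    """Count how many triples (n1, n2, n3) share each energy value."""
    return Counter(levels.values())


if __name__ == "__main__":
    cube = degeneracies(box_levels(1, 1, 1))
    for energy in sorted(cube)[:6]:
        print(energy, cube[energy])
```

For a cube the lowest values are 3, 6, 9, 11, 12, 14 (in units of $h^2/8ma^2$), occurring 1, 3, 3, 3, 1, and 6 times respectively; the repetitions disappear when the three sides are incommensurable.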

46. Central forces (general). Let us now consider a particle
subject to the action of a central force. It will evidently be
convenient to use the polar coordinates $r$, $\theta$, $\varphi$, having their origin
at the center of force, and the potential $U$ will be a function of $r$
alone.

The Schrödinger equation (127), with the explicit expression
for the operator $\Delta$ in terms of polar coordinates, is written

$$\frac{1}{r}\frac{\partial^2 (ru)}{\partial r^2} + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2 u}{\partial\varphi^2} + \frac{8\pi^2 m}{h^2}(E - U)u = 0. \tag{221}$$

Let us try to separate the variable $r$ from $\theta$ and $\varphi$ by setting

$$u = R(r)\,Y(\theta, \varphi). \tag{222}$$

Substituting into the preceding equation and multiplying through
by $r^2/RY$, we obtain

$$\frac{r}{R}\frac{d^2 (rR)}{dr^2} + \frac{1}{Y \sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial Y}{\partial\theta}\right) + \frac{1}{Y \sin^2\theta}\frac{\partial^2 Y}{\partial\varphi^2} + \frac{8\pi^2 m}{h^2}\,r^2(E - U) = 0.$$

In this equation the first and last terms depend only on $r$, the
other two terms only on $\theta$ and $\varphi$. Hence the equation is separated
into two:

$$\frac{1}{Y \sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial Y}{\partial\theta}\right) + \frac{1}{Y \sin^2\theta}\frac{\partial^2 Y}{\partial\varphi^2} = -C, \tag{223}$$

$$\frac{r}{R}\frac{d^2 (rR)}{dr^2} + \frac{8\pi^2 m}{h^2}\,r^2(E - U) = C, \tag{224}$$

where $C$ is a constant. The first equation does not contain the
function $U(r)$ and is therefore common to all problems with central
forces. Hence in (222) the factor $Y(\theta, \varphi)$ is independent of the law
according to which the force varies as a function of $r$, whereas the
factor $R(r)$ does depend on that law and is therefore different in
various problems. In this section we shall study equation (223),
which holds for all problems with central forces. It may be
written


$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial Y}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y}{\partial\varphi^2} + CY = 0, \tag{223'}$$

and it may be shown that the equation has solutions which are
finite, continuous, and single-valued for all directions only if

$$C = l(l + 1), \tag{225}$$

with $l = 0, 1, 2, \ldots$. The integer $l$ is called azimuthal quantum
number because it corresponds to the azimuthal quantum number
of the Bohr-Sommerfeld theory. With the expression (225) for $C$, we
recognize in (223') the differential equation for spherical harmonics
of order $l$.

Upon further separation of the variables $\theta$ and $\varphi$ by the
substitution

$$Y(\theta, \varphi) = \Theta(\theta)\cdot\Phi(\varphi), \tag{226}$$

the equation again splits into two:

$$\frac{d^2\Phi}{d\varphi^2} + \lambda\Phi = 0, \tag{227}$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left(C - \frac{\lambda}{\sin^2\theta}\right)\Theta = 0, \tag{228}$$

where $\lambda$ is a constant.

Equation (227) gives

$$\Phi = \text{const}\cdot e^{\pm i\sqrt{\lambda}\,\varphi}. \tag{229}$$

Since $\Phi$ must be periodic in $\varphi$ with period $2\pi$ (otherwise $u$ would not
be single-valued for every point of space), $\sqrt{\lambda}$ must be equal to an
integer* $m$; that is,

$$\lambda = m^2. \tag{230}$$

Then, if we determine the constant so as to make

$$\int_0^{2\pi} |\Phi|^2\,d\varphi = 1, \tag{231}$$

(229) becomes

$$\Phi = \frac{1}{\sqrt{2\pi}}\,e^{im\varphi}, \tag{229'}$$

where the exponent has been written without the double sign, with
the understanding that $m$ may be positive, zero, or negative. The
integer $m$ (with its sign) is called magnetic quantum number.

In order to study (228) it is convenient to take for our variable
$\cos\theta$, which we shall indicate by $x$. Then the equation becomes,
with (230) taken into account,

$$\frac{d}{dx}\left[(1 - x^2)\frac{d\Theta}{dx}\right] + \left(C - \frac{m^2}{1 - x^2}\right)\Theta = 0. \tag{232}$$

The eigenvalues and eigenfunctions of this equation are investigated
by a method analogous to that followed in §39 for the oscillator.
The singular points of the equation are $x = \pm 1$ and $x = \infty$,
as may be readily recognized by dividing through by $(1 - x^2)$
(see §16); and we note that we are dealing with singularities of the
Fuchsian type. Since in our case $x$ ranges from $-1$ to $+1$, we are
interested only in the singularities at the endpoints of this interval.
By the method of the indicial equation, we find that in the
neighborhood of $x = +1$, $\Theta$ may be put in the form

$$(1 - x)^{\pm |m|/2} f_1(1 - x),$$

where $f_1$ is a holomorphic function which does not vanish for
$1 - x = 0$. Of the two forms, the one with the negative exponent
has a pole for $1 - x = 0$ and hence is to be discarded. Therefore
$\Theta = (1 - x)^{|m|/2} f_1(1 - x)$. Similarly, the other singular point gives
$\Theta = (1 + x)^{|m|/2} f_2(1 + x)$, with $f_2$ holomorphic and nonzero at
$1 + x = 0$. Since these two expressions must represent two different
expansions of the same branch of the continuous function $\Theta$,
we must be able to write

$$\Theta = (1 - x^2)^{|m|/2} P(x), \tag{233}$$

with

$$P(x) = \sum_{r=0}^{\infty} a_r x^r. \tag{234}$$

* We shall adopt $m$ in order to conform to universal usage, although $m$ also
designates the mass of the particle.


Substituting (233) into (232), we find for $P$ the equation

$$(1 - x^2)P'' - 2(|m| + 1)xP' + [C - |m|(|m| + 1)]P = 0; \tag{235}$$

and when trying to satisfy (235) with the series (234), we find for
the $a_r$ the recursion relation

$$(r + 1)(r + 2)\,a_{r+2} = [r(r - 1) + 2(|m| + 1)r - C + |m|(|m| + 1)]\,a_r. \tag{236}$$

Hence we may take for $P$ a series in even or in odd powers, with
arbitrary first coefficient.

The condition that $P$ be finite over the whole interval under
consideration is certainly met if one of the coefficients, $a_{\gamma+2}$ for
example, vanishes together with all succeeding ones, and the series $P$
reduces to a polynomial of degree $\gamma$. The condition for $a_{\gamma+2} = 0$
($a_\gamma$ being $\neq 0$) is, as may be seen from (236), that

$$C = \gamma(\gamma - 1) + 2(|m| + 1)\gamma + |m|(|m| + 1) = (\gamma + |m|)(\gamma + |m| + 1);$$

and if we put

$$\gamma + |m| = l \tag{237}$$

(from which $l$ comes out to be a nonnegative integer $\geq |m|$), we
obtain (225).

It may also be shown that this condition is not only sufficient
but also necessary;* that is, if $P$ does not reduce to a polynomial, it
cannot satisfy the desired conditions.

Legendre polynomials. Let us first of all consider equation (235)
for $m = 0$, in which case $Y$ does not depend on $\varphi$ and coincides
(within a constant factor) with $\Theta$, and the latter with $P$. With
(225) taken into account, the equation is written

$$(1 - x^2)P'' - 2xP' + l(l + 1)P = 0, \tag{238}$$

and the desired solution is a polynomial $P(x)$ of degree $l$, satisfying
the recursion formula (236). These polynomials, which were known
for a long time in the theory of spherical harmonics, are defined as
follows. Choose $a_1 = 0$ if $l$ is even, and $a_0 = 0$ if $l$ is odd, so that
in the first case the polynomial has only even powers, in the second
case only odd powers. The other coefficient is chosen in such a way
as to make $P(1) = 1$. The polynomial $P_l(x)$, defined in this
manner, is called Legendre polynomial of degree $l$. When expressed
in the form $P_l(\cos\theta)$ it is also called zonal harmonic (in fact, because
of what has been said above, this is a particular case of spherical
harmonics having axial symmetry). It may be shown that all the
$l$ roots of the polynomial are real and lie between $-1$ and $+1$
(endpoints excluded, since $P_l(\pm 1) = \pm 1$). Following are the expressions
for the first six Legendre polynomials:

$$P_0(x) = 1, \qquad P_1(x) = x,$$
$$P_2(x) = \tfrac{3}{2}x^2 - \tfrac{1}{2}, \qquad P_3(x) = \tfrac{5}{2}x^3 - \tfrac{3}{2}x,$$
$$P_4(x) = \tfrac{35}{8}x^4 - \tfrac{15}{4}x^2 + \tfrac{3}{8}, \qquad P_5(x) = \tfrac{63}{8}x^5 - \tfrac{35}{4}x^3 + \tfrac{15}{8}x.$$

Since these expressions are eigenfunctions of (238), they constitute
a system of orthogonal functions in the interval $(-1, +1)$.
They are not normalized, however, since

$$\int_{-1}^{+1} P_l^2(x)\,dx = \frac{2}{2l + 1}.$$

These expressions may also be defined by means of the $l$th
derivative of the expression $(x^2 - 1)^l$. In fact, we have

$$P_l(x) = \frac{1}{2^l\,l!}\frac{d^l}{dx^l}(x^2 - 1)^l. \tag{240}$$

Another one of their important properties is expressed by the
recursion formula which relates three successive polynomials:

$$(l + 1)P_{l+1}(x) - (2l + 1)\,x\,P_l(x) + l\,P_{l-1}(x) = 0,$$

which permits us to calculate successively all the Legendre
polynomials starting with $P_0$ and $P_1$.

* See Bechert, Ann. d. Physik 88, 906 (1927).
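The successive calculation just described is easy to carry out exactly. The sketch below builds the polynomials from $P_0$ and $P_1$ with the three-term recursion $(l + 1)P_{l+1} = (2l + 1)xP_l - lP_{l-1}$, using rational arithmetic, and then checks the value at $x = 1$ and the normalization integral. The helper names are illustrative only.

```python
from fractions import Fraction


def legendre_polys(l_max):
    """Coefficient lists (lowest power first) of P_0 ... P_{l_max}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for l in range(1, l_max):
        x_pl = [Fraction(0)] + polys[l]               # x * P_l
        p_prev = polys[l - 1] + [Fraction(0)] * 2     # P_{l-1}, padded to length l + 2
        polys.append([((2 * l + 1) * x_pl[i] - l * p_prev[i]) / (l + 1)
                      for i in range(l + 2)])
    return polys[: l_max + 1]


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def integrate_minus1_to_1(p):
    """Exact integral over (-1, +1); odd powers drop out by symmetry."""
    return sum(2 * c / (k + 1) for k, c in enumerate(p) if k % 2 == 0)


def value_at_one(p):
    """P(1) is simply the sum of the coefficients."""
    return sum(p)


if __name__ == "__main__":
    for l, p in enumerate(legendre_polys(5)):
        print(l, [str(c) for c in p], integrate_minus1_to_1(poly_mul(p, p)))
```

Each polynomial produced in this way takes the value 1 at $x = 1$, and the integral of the product of two of them over $(-1, +1)$ is $2/(2l + 1)$ when they coincide and zero otherwise.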

Associated Legendre functions. Let us proceed now to a
consideration of (235) without the restriction $m = 0$. It may be
written, taking (225) into account, as

$$(1 - x^2)P'' - 2(|m| + 1)xP' + [l(l + 1) - |m|(|m| + 1)]P = 0, \tag{242}$$

with $|m| \leq l$.

Its eigenfunctions may be obtained by means of the recursion
formula (236), but it may also be noted that if we take the derivative
of the last equation with respect to $x$, we obtain an equation
of the same form, in which instead of $P$, $P'$, $P''$ there occur $P'$, $P''$,
$P'''$, respectively, and where instead of $|m|$ there is $(|m| + 1)$, so
that if $P_l$ satisfies (242) for $|m| = 0$, as we have seen, $P_l'$ will satisfy
it for $|m| = 1$, $P_l''$ for $|m| = 2$, and so on. Therefore (242) is
satisfied by $d^{|m|}P_l/dx^{|m|}$, which is a polynomial of degree
$l - |m| = \gamma$, and hence (232) is satisfied by taking $\Theta$ equal
(or proportional) to

$$P_l^m(x) = (1 - x^2)^{|m|/2}\,\frac{d^{|m|}P_l}{dx^{|m|}}.$$

These functions are called associated Legendre functions. They
are of course orthogonal over the interval $(-1, +1)$ but are not
normalized, since

$$\int_{-1}^{+1} [P_l^m(x)]^2\,dx = \frac{(l + |m|)!}{(l - |m|)!}\,\frac{2}{2l + 1}.$$

Thus, in order that $\Theta$ also be normalized (with respect to the
variable $x$), it will suffice to take for its expression

$$\Theta = \sqrt{\frac{(l - |m|)!}{(l + |m|)!}\,\frac{2l + 1}{2}}\;P_l^m(x) = \sqrt{\frac{(l - |m|)!}{(l + |m|)!}\,\frac{2l + 1}{2}}\;\sin^{|m|}\theta\,\frac{d^{|m|}P_l(\cos\theta)}{d(\cos\theta)^{|m|}}, \tag{243}$$

and we then get

$$\int_0^{\pi} \Theta^2 \sin\theta\,d\theta = 1. \tag{244}$$

To summarize, to each eigenvalue $C = l(l + 1)$ of (223') there
correspond $2l + 1$ eigenfunctions $Y_{lm}(\theta, \varphi)$ (with $m = -l, -l + 1,
\ldots, 0, 1, \ldots, l$), given by (226), (229'), (243), that is, by

$$Y_{lm} = \frac{1}{N_{lm}}\,e^{im\varphi}\,\sin^{|m|}\theta\,\frac{d^{|m|}P_l(\cos\theta)}{d(\cos\theta)^{|m|}}, \tag{245}$$

where we have placed in evidence the normalization factor

$$\frac{1}{N_{lm}} = \frac{1}{\sqrt{2\pi}}\sqrt{\frac{(l - |m|)!}{(l + |m|)!}\,\frac{2l + 1}{2}}. \tag{246}$$

$Y_{lm}$ thus normalized has the property, as a consequence of (231)
and (244), that

$$\iint |Y_{lm}|^2 \sin\theta\,d\theta\,d\varphi = 1, \tag{246'}$$

where the integral is extended over the whole spherical surface.
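The normalization integral of the associated Legendre functions, and hence the factor (246), can be verified exactly for small $l$. In the sketch below $P_l$ is obtained from formula (240), differentiated $|m|$ times, squared, and multiplied by $(1 - x^2)^{|m|}$; the resulting polynomial is then integrated term by term. The function names are illustrative, and rational arithmetic is used so that the comparison is exact rather than approximate.

```python
from fractions import Fraction
from math import comb, factorial


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_diff(p):
    """Derivative of a polynomial; a constant differentiates to [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def legendre(l):
    """P_l from formula (240): the l-th derivative of (x^2 - 1)^l over 2^l l!."""
    base = [Fraction(0)] * (2 * l + 1)
    for k in range(l + 1):
        base[2 * k] = Fraction(comb(l, k) * (-1) ** (l - k))
    for _ in range(l):
        base = poly_diff(base)
    return [c / (2**l * factorial(l)) for c in base]


def assoc_norm_integral(l, m):
    """Exact integral over (-1, +1) of the square of the associated function."""
    d = legendre(l)
    for _ in range(m):
        d = poly_diff(d)
    integrand = poly_mul(d, d)
    for _ in range(m):
        integrand = poly_mul(integrand, [Fraction(1), Fraction(0), Fraction(-1)])
    return sum(2 * c / (k + 1) for k, c in enumerate(integrand) if k % 2 == 0)


def theta_norm_squared(l, m):
    """Square of the factor multiplying the associated function in (243)."""
    return Fraction(factorial(l - m), factorial(l + m)) * Fraction(2 * l + 1, 2)


if __name__ == "__main__":
    for l in range(4):
        for m in range(l + 1):
            print(l, m, assoc_norm_integral(l, m))
```

Multiplying `assoc_norm_integral(l, m)` by `theta_norm_squared(l, m)` gives exactly 1 for every pair tried, which is the content of (244).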

These functions are particular spherical (surface) harmonics of
order $l$. Of these, the harmonic corresponding to $m = 0$ reduces to

$$Y_{l0} = \frac{1}{N_{l0}}\,P_l(\cos\theta); \tag{247}$$

it does not depend on $\varphi$, and hence has axial symmetry.

Furthermore, we note that to the eigenvalue zero ($l = 0$) there
corresponds only the eigenfunction

$$Y_{00} = \frac{1}{N_{00}} = \frac{1}{2\sqrt{\pi}}, \tag{248}$$

that is, a constant. Hence $u$ in this case depends only on $r$ (spherical
symmetry).

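We shall list here, for the convenience of the reader, the explicit
expressions of the spherical harmonics corresponding to the first
four values of $l$, which occur most often in atomic mechanics. We
shall let $m$ take on only the values 0, 1, 2, 3; to obtain the functions
corresponding to the negative values of $m$, it is necessary only to
change the sign of the exponent of $e$.

$$Y_{00} = \frac{1}{2\sqrt{\pi}}$$

$$Y_{10} = \frac{1}{2}\sqrt{\frac{3}{\pi}}\,\cos\theta \qquad Y_{11} = \frac{1}{2}\sqrt{\frac{3}{2\pi}}\,\sin\theta\,e^{i\varphi}$$

$$Y_{20} = \frac{1}{4}\sqrt{\frac{5}{\pi}}\,(3\cos^2\theta - 1) \qquad Y_{21} = \frac{1}{2}\sqrt{\frac{15}{2\pi}}\,\sin\theta\cos\theta\,e^{i\varphi}$$

$$Y_{22} = \frac{1}{4}\sqrt{\frac{15}{2\pi}}\,\sin^2\theta\,e^{2i\varphi}$$

$$Y_{30} = \frac{1}{4}\sqrt{\frac{7}{\pi}}\,(5\cos^3\theta - 3\cos\theta) \qquad Y_{31} = \frac{1}{4}\sqrt{\frac{21}{4\pi}}\,\sin\theta\,(5\cos^2\theta - 1)\,e^{i\varphi}$$

$$Y_{32} = \frac{1}{4}\sqrt{\frac{105}{2\pi}}\,\sin^2\theta\cos\theta\,e^{2i\varphi} \qquad Y_{33} = \frac{1}{4}\sqrt{\frac{35}{4\pi}}\,\sin^3\theta\,e^{3i\varphi}$$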
We should add now, anticipating a result which will be proved in
Part III, that the azimuthal quantum number $l$ and the magnetic
quantum number $m$ have the following physical significance: the
modulus $p$ of the angular momentum of the electron (moment of
momentum) with respect to the nucleus is

$$p = \sqrt{l(l + 1)}\;\frac{h}{2\pi},$$

and the angular momentum component of the electron along the
$z$-axis is $m(h/2\pi)$.

47. The radial equation in the problem of central forces. We
shall now deal with the factor $R(r)$ of (222), which depends on the
force law. This factor satisfies equation (224), where $C$ is to be
replaced by $l(l + 1)$. By setting

$$y(r) = rR(r), \tag{249}$$

we may write this equation

$$\frac{d^2 y}{dr^2} + \left[\frac{8\pi^2 m}{h^2}(E - U) - \frac{l(l + 1)}{r^2}\right] y = 0. \tag{250}$$

Upon $y$, we must furthermore impose the conditions that for $r = 0$,
$R$ is to remain finite or have a pole of order < 1, so that

$$y(0) = 0,$$

and that for $r \to \infty$, $R$ is to tend to zero not less rapidly than $1/r$,
and hence that $y$ is to tend to a finite limit or to zero.

These conditions lead, when the potential $U(r)$ is specified, to
the determination of a sequence of eigenvalues for $E$ (in general
partly discrete, partly continuous) which represent the energy levels
of the system. If they are discrete, they may be designated by an
index $n$ (principal quantum number) in addition to the index $l$ which
already occurs in (250), so that we shall write $E_{nl}$, and for the
corresponding eigenfunctions, $R_{nl}$. As far as the normalization of
these is concerned, we note that $u(r, \theta, \varphi)$ must be normalized in
such a manner that

$$\int_0^{\infty}\!\int_0^{\pi}\!\int_0^{2\pi} |u|^2\,r^2 \sin\theta\,dr\,d\theta\,d\varphi = 1; \tag{251}$$

since $u = R(r)\,\Theta(\theta)\,\Phi(\varphi)$ and since the two last factors are already
normalized in accordance with (231) and (244), $R$ must be normalized
so that

$$\int_0^{\infty} R^2 r^2\,dr = 1. \tag{252}$$

We note that although $u$ depends on three quantum numbers
($n$, $l$, $m$), the energy levels depend on only two of them, since the
magnetic quantum number $m$ does not enter. Each of these levels
is thus a multiplet of order $2l + 1$ (degeneracy; see §6), since there
are as many independent solutions for $u$, corresponding to the values
which $m$ can take (from $-l$ to $+l$).

Finally, it should be pointed out that in (250) we may also
incorporate the term $l(l + 1)/r^2$ into the potential, considering as
new potential the function

$$U_l(r) = U(r) + \frac{h^2}{8\pi^2 m}\,\frac{l(l + 1)}{r^2}, \tag{253}$$

whereupon the equation takes the form

$$\frac{d^2 y}{dr^2} + \frac{8\pi^2 m}{h^2}(E - U_l)\,y = 0, \tag{254}$$

identical with the equation for one-dimensional motion under the
action of a potential $U_l$. This form may be formally interpreted
by saying that to the force deriving from the potential $U$ there is
added the force deriving from the second term of (253), which has
its analogue in the centrifugal force of classical mechanics and
likewise depends on the angular momentum, that is, on the
azimuthal quantum number $l$.

48. Theory of hydrogenlike systems. The eigenvalues. Let
us now apply the theory of central forces to the case of the hydrogen
atom and of other so-called hydrogenlike systems, formed
(see §14 of Part I) of a nucleus of charge $Ze$ and of a single electron
(the case of hydrogen corresponds to $Z = 1$). In view of the
preponderant mass of the nucleus, we may consider the latter as
fixed* and therefore the electron to be subject to a central attractive
force equal to $Ze^2/r^2$. Hence the potential is

$$U = -\frac{Ze^2}{r}.$$

* A more rigorous treatment, taking the motion of the nucleus into account,
will be given in §21 of Part III.

Whatever has been said in general in §46 concerning the dependence
of $\psi$ on the coordinates $\theta$ and $\varphi$ may be carried over bodily to
the present case, since it is independent of the force law. We must
only specialize the radial equation (250), which becomes, with the
present expression for $U$,

$$\frac{d^2 y}{dr^2} + \left[\frac{8\pi^2 m}{h^2}\left(E + \frac{Ze^2}{r}\right) - \frac{l(l + 1)}{r^2}\right] y = 0. \tag{255}$$

In order to reduce this equation to a simple form, it is convenient
to let the constant term in the brackets equal $\pm 1/r_0^2$, where the
plus sign holds if $E > 0$ and the minus sign when $E < 0$, and
where $r_0$ has the dimensions of a length and is given by

$$r_0^2 = \frac{h^2}{8\pi^2 m |E|}; \tag{256}$$

furthermore, it is convenient to take for independent variable

$$x = \frac{2r}{r_0}, \tag{257}$$

whereupon the equation becomes

$$\frac{d^2 y}{dx^2} + \left[\pm\frac{1}{4} + \frac{A}{x} - \frac{l(l + 1)}{x^2}\right] y = 0, \tag{258}$$

where

$$A = \frac{4\pi^2 m Z e^2}{h^2}\,r_0 = \frac{\pi Z e^2}{h}\sqrt{\frac{2m}{|E|}}. \tag{259}$$

The equation has a singularity of the Fuchsian type for $x = 0$ and
a non-Fuchsian singularity for $x = \infty$. We are to find the values
of the parameter $A$ for which this equation has solutions satisfying
the boundary conditions [$y(0) = 0$ and $y$ not infinite for $x \to \infty$],
and the corresponding eigenfunctions. At this point it is perhaps
best to treat the two cases $E < 0$ and $E > 0$ separately. In the
first case, we must take the minus sign in (258), in the second case
the plus sign.†

Case 1: $E < 0$. The asymptotic behavior of the solutions of
(258), for $x \to \infty$, may be obtained by considering that the equation
approaches the form

$$\frac{d^2 y}{dx^2} - \frac{1}{4}\,y = 0,$$

of which two fundamental solutions are

$$y = e^{\pm x/2}.$$

Discarding the solution with the plus sign, which tends toward
infinity, we are led to search for solutions which at infinity behave
as $e^{-x/2}$. This suggests that we put

$$y = e^{-x/2}\,v(x). \tag{260}$$

With this substitution, the given equation is transformed into the
following equation for $v(x)$:

$$v'' - v' + \left[\frac{A}{x} - \frac{l(l + 1)}{x^2}\right] v = 0, \tag{261}$$

which for the finite region of $x$ does not have singularities other
than the one at $x = 0$ (Fuchsian type). For this singularity
the indicial equation is

$$\alpha(\alpha - 1) - l(l + 1) = 0,$$

from which $\alpha = -l$, or $\alpha = l + 1$. Discarding the negative solution,
which would render $v$ infinite at the origin, we are left with

$$v = x^{l+1}\,\omega(x), \tag{262}$$

where $\omega(x)$ is of the form

$$\omega(x) = \sum_{s=0}^{\infty} a_s x^s. \tag{263}$$

† For a more complete treatment, see, for example, No. 1 of the Bibliography.

Substituting (262) into (261), we find that $\omega$ must satisfy the
equation

$$\omega'' + \left[\frac{2(l + 1)}{x} - 1\right]\omega' + \frac{A - l - 1}{x}\,\omega = 0. \tag{264}$$

Then, substituting (263) and equating the coefficients of each power
to zero, we find for $a_s$ the recursion formula

$$a_{s+1} = a_s\,\frac{s + l + 1 - A}{(s + 1)(s + 2l + 2)}. \tag{265}$$

In order that the series reduce to a polynomial* (whose degree
we shall indicate by $n'$), we must have $a_{n'+1} = 0$, $a_{n'} \neq 0$. Hence

$$n' + l + 1 - A = 0,$$

from which we get that $A$ must be equal to the integer

$$n = n' + l + 1. \tag{266}$$

Finally, solving for $|E|$ from (259), we find that if $E$ is negative, it
must have one of the eigenvalues

$$E_n = -\frac{2\pi^2 m Z^2 e^4}{h^2 n^2}. \tag{267}$$

These are the same energy levels as given by the Bohr theory
(see §16 of Part I), in perfect agreement with experience, as we
have seen. The integer $n$ corresponds to the principal quantum
number, and the energy depends on it solely.

We note that with the aid of (267) the expression (256) for $r_0$
becomes

$$r_0 = n a_1, \qquad \text{where} \qquad a_1 = \frac{h^2}{4\pi^2 m Z e^2}. \tag{268}$$

* If it does not, it can be shown (see e.g. No. 10c of the Bibliography, section
29b) that $\omega$ behaves asymptotically as $e^x$, and therefore $y$ does not meet the
required condition at infinity.

It is useful to mention that the constant $a_1$ defined in this manner
is equal to the radius $a_1$ of the first of the circular orbits furnished
by the Bohr theory [formula (16), page 38].
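The formulas of this case can be put to a quick numerical test. The sketch below evaluates (267) and (268) with rounded CGS values of the constants (these values and the conversion factor to electron volts are assumptions of the example, not data from the text), checks that (256) indeed gives $r_0 = n a_1$, and generates the coefficients $a_s$ from the recursion (265) with $A = n$ to show that the series stops at degree $n' = n - l - 1$.

```python
import math
from fractions import Fraction

# Rounded CGS (Gaussian) values; treat them as assumptions of this sketch.
H_PLANCK = 6.62607e-27     # erg s
M_ELECTRON = 9.10938e-28   # g
E_CHARGE = 4.80320e-10     # esu
ERG_TO_EV = 6.24151e11     # electron volts per erg


def energy_level(n, z=1):
    """E_n from (267), in erg (negative for the bound states)."""
    return (-2.0 * math.pi**2 * M_ELECTRON * z**2 * E_CHARGE**4
            / (H_PLANCK**2 * n**2))


def bohr_radius(z=1):
    """a_1 from (268), in cm."""
    return H_PLANCK**2 / (4.0 * math.pi**2 * M_ELECTRON * z * E_CHARGE**2)


def r0_from_energy(n, z=1):
    """r_0 from (256), evaluated with |E| = |E_n|."""
    return H_PLANCK / math.sqrt(
        8.0 * math.pi**2 * M_ELECTRON * abs(energy_level(n, z)))


def omega_coefficients(n, l):
    """Coefficients a_0 = 1, a_1, ... generated by (265) with A = n."""
    if not (n > l >= 0):
        raise ValueError("the series terminates only for integers n > l >= 0")
    coeffs = [Fraction(1)]
    s = 0
    while True:
        nxt = coeffs[-1] * Fraction(s + l + 1 - n, (s + 1) * (s + 2 * l + 2))
        if nxt == 0:
            return coeffs      # terminated: the degree is n' = n - l - 1
        coeffs.append(nxt)
        s += 1


if __name__ == "__main__":
    print(energy_level(1) * ERG_TO_EV, bohr_radius())
    print(omega_coefficients(3, 0))
```

With these constants the lowest hydrogen level comes out close to $-13.6$ electron volts and $a_1$ close to $5.29 \times 10^{-9}$ cm; for $n = 3$, $l = 0$ the recursion yields the quadratic polynomial $1 - x + x^2/6$.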

Case 2: $E > 0$. In this case equation (258), for $x$ tending
toward $\infty$, approaches the form

$$\frac{d^2 y}{dx^2} + \frac{1}{4}\,y = 0, \tag{269}$$

whose general solution is

$$y = c_1 e^{ix/2} + c_2 e^{-ix/2},$$

which is always finite for $x \to \infty$. Hence any integral of (258)
will remain finite for $x \to \infty$. Therefore we are not forced to impose
any limitations upon $A$; that is, any (positive) value of $A$ is an
eigenvalue of (258). Hence any (positive) value of $E$ is an
eigenvalue of (255) (continuous spectrum of eigenvalues). The fact that
the energy may have any positive value is in perfect agreement
with what is found in the Bohr-Sommerfeld theory. In fact, the
positive values of $E$ correspond to hyperbolic orbits, and hence
neither to periodic, nor to multiply periodic motions, which are
therefore not to be quantized.*

* We might even say that these states do not represent a true hydrogen
atom but only the combination of an electron and a nucleus which, after
approaching each other slightly, recede from each other indefinitely.
Therefore, these solutions are of interest in collision theory.

If we proceed as in the previous case, setting

$$y = e^{\pm ix/2}\,v(x), \tag{270}$$

we are led to the equation for $v$

$$v'' \pm i v' + \left[\frac{A}{x} - \frac{l(l + 1)}{x^2}\right] v = 0, \tag{271}$$

and setting, as before,

$$v = x^{l+1}\sum_{s=0}^{\infty} a_s x^s, \tag{272}$$

we find for $a_s$ the recursion formula

$$a_{s+1} = a_s\,\frac{-A \mp i(l + s + 1)}{(2l + s + 2)(s + 1)}.$$

As can be seen, since $A$ is real, it is impossible for the numerator
of this fraction to vanish, and therefore the series no longer reduces
to a polynomial. The solution is a linear combination of the two
contained in formula (270), namely,

$$y = x^{l+1}\left[e^{ix/2}\sum_{s=0}^{\infty} a_s x^s + e^{-ix/2}\sum_{s=0}^{\infty} \bar{a}_s x^s\right], \tag{274}$$

in which the coefficients of the second sum are the complex
conjugates of the coefficients in the first sum, so that $y$ turns out real.
It may easily be verified that these series converge absolutely for
any value of $x$.

49. The eigenfunctions of hydrogenlike systems for $E < 0$.
We shall now deal with the form of the eigenfunctions for the case
where $E < 0$ (which is the most interesting one since it corresponds
to a stable system). Since the factor depending on $\theta$ and $\varphi$ (that is,
the spherical harmonic $Y_{lm}$) was already discussed in §46, being
common to all problems with central forces, there remains the
factor $R(r)$ to be examined. Recalling the steps (249), (257), (260),
and (262), we have

$$R(r) = \frac{y}{r} = \frac{2}{r_0 x}\,e^{-x/2}\,v(x) = \frac{2}{r_0}\,e^{-x/2}\,x^l\,\omega(x),$$

where $\omega(x)$ is a polynomial of degree $n'$ satisfying the differential
equation (264), which we shall now write in the form

$$x\omega'' + (2l + 2 - x)\omega' + n'\omega = 0, \tag{264'}$$

and which is determined by the recursion relation (265), once the
first coefficient $a_0$ has been fixed at will. The study of these
polynomials is facilitated by the fact that they are related to a
class of functions which have been studied for a long time and
which possess important properties: the Laguerre polynomials.

Note on the Laguerre polynomials. The Laguerre polynomial of
degree $K$, which is indicated by $L_K(x)$, is defined by the formula

$$L_K(x) = e^x\,\frac{d^K}{dx^K}\left(x^K e^{-x}\right). \tag{275}$$

Following are the explicit expressions of the first few Laguerre
polynomials:

$$L_0 = 1$$
$$L_1 = 1 - x$$
$$L_2 = 2 - 4x + x^2$$
$$L_3 = 6 - 18x + 9x^2 - x^3$$

The interest of these polynomials lies in the fact that they are
solutions of a well-known differential equation, as may be seen in
the following manner. Letting

$$u(x) = x^K e^{-x}, \tag{276}$$

such that

$$L_K = e^x u^{(K)}, \tag{275'}$$

where by $u^{(K)}$ we denote, as we shall always do systematically, the
$K$th derivative of $u$, we have, upon a first differentiation of (276),

$$x u' = (K - x)u.$$

We differentiate this relation another $(K + 1)$ times; for this
operation it is useful to note that the known formula of Leibnitz
for the $n$th derivative of a product, for the case in which the product
is of the form $x\varphi(x)$, reduces to
$\frac{d^n}{dx^n}(x\varphi) = x\varphi^{(n)} + n\varphi^{(n-1)}$. We
then obtain for $u$ the following differential equation:

$$x u^{(K+2)} + (1 + x)u^{(K+1)} + (K + 1)u^{(K)} = 0.$$

Substituting the $u^{(K)}$ obtained from (275'), we get the
characteristic equation for the $K$th Laguerre polynomial:

$$x L_K'' + (1 - x)L_K' + K L_K = 0. \tag{277}$$

It is to be noted that the Laguerre polynomials are not eigenfunctions
of this equation, nor are they orthogonal. However,
as may be shown* by successive integrations by parts, utilizing (275),
they have the property

$$\int_0^\infty e^{-x} L_K(x) L_{K'}(x)\,dx = (K!)^2\,\delta_{KK'}, \tag{278}$$

which may be expressed as follows: the functions $e^{-x/2} L_K / K!$ constitute
an orthonormal system in the interval 0 to $\infty$.

* For this and other properties of the Laguerre polynomials, see, for instance,
No. 25 or No. 34 of the Bibliography.

Another important property of the Laguerre polynomials, which
we shall merely mention here, is the one expressed by the recursion
formula

$$L_{K+1} - (2K + 1 - x) L_K + K^2 L_{K-1} = 0. \tag{279}$$

Associated Laguerre polynomials. By differentiating equation
(277), we obtain

$$x L_K''' + (2 - x) L_K'' + (K - 1) L_K' = 0, \tag{280}$$

which may be considered as an equation of the second order in the
function $L_K'$, that is, $L_K^1$. This one then satisfies the equation

$$x \frac{d^2 L_K^1}{dx^2} + (2 - x) \frac{d L_K^1}{dx} + (K - 1) L_K^1 = 0,$$

which is the analogue of (277), except that the second coefficient
is increased by 1 and the third is decreased by 1.

To this equation we may again apply the same procedure.
Differentiating j times, we find that the function $L_K^j$, that is, the jth
derivative of $L_K$, satisfies the equation

$$x \frac{d^2 L_K^j}{dx^2} + (j + 1 - x) \frac{d L_K^j}{dx} + (K - j) L_K^j = 0. \tag{281}$$

This function, evidently a polynomial of degree $(K - j)$, is
sometimes called associated Laguerre polynomial and is defined by

$$L_K^j(x) = \frac{d^j}{dx^j} L_K(x). \tag{282}$$

We note that for $j > K$ the polynomial $L_K^j$ is identically zero.

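The associated polynomials corresponding to a given superscript j
and to different K-values, when multiplied by $x^{j/2} e^{-x/2}$, give rise
to functions which are orthogonal over the interval 0 to $\infty$. Specifically,
we have

$$\int_0^\infty x^j e^{-x} L_K^j(x) L_{K'}^j(x)\,dx = \begin{cases} 0 & (\text{for } K \neq K') \\ \dfrac{(K!)^3}{(K - j)!} & (\text{for } K = K'). \end{cases} \tag{282'}$$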
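The orthogonality of the associated polynomials with weight $x^j e^{-x}$ can be verified numerically. The sketch below assumes the unnormalized convention $L_K = e^{x}\,d^K/dx^K\,(x^K e^{-x})$ and the value $(K!)^3/(K-j)!$ for $K = K'$; the helper names `laguerre`, `assoc_laguerre` and `overlap` are hypothetical. It uses NumPy's Gauss-Laguerre quadrature, which integrates a polynomial times $e^{-x}$ over 0 to $\infty$.

```python
import math
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.laguerre import laggauss

def laguerre(K):
    """L_K(x) = e^x d^K/dx^K (x^K e^{-x}), unnormalized convention."""
    P = Polynomial([0.0] * K + [1.0])  # x^K
    for _ in range(K):
        P = P.deriv() - P              # derivative of P e^{-x}, divided by e^{-x}
    return P

def assoc_laguerre(K, j):
    """Associated polynomial: the j-th derivative of L_K (j >= 1)."""
    return laguerre(K).deriv(j)

def overlap(K1, K2, j, npts=20):
    """Integral over (0, inf) of x^j e^{-x} L^j_{K1}(x) L^j_{K2}(x)."""
    # laggauss supplies nodes and weights for the weight function e^{-x};
    # the rule is exact for polynomial integrands of degree <= 2*npts - 1.
    x, w = laggauss(npts)
    f = x**j * assoc_laguerre(K1, j)(x) * assoc_laguerre(K2, j)(x)
    return float(np.sum(w * f))

if __name__ == "__main__":
    print(overlap(2, 2, 1), math.factorial(2)**3 / math.factorial(1))
    print(overlap(3, 2, 1))  # different K-values: should be close to zero
```

For j = 1 and K = 2 the integral can also be done by hand: $L_2^1 = 2x - 4$, and $\int_0^\infty x e^{-x}(2x - 4)^2\,dx = 24 - 32 + 16 = 8 = (2!)^3/1!$.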

We now note that equation (264'), which w satisfies, is identical
with equation (281) of the associated Laguerre polynomials, provided
that we take

$$j + 1 = 2l + 2, \qquad K - j = n';$$

that is,

$$j = 2l + 1, \qquad K = 2l + 1 + n' = n + l.$$

On the other hand, w too is a polynomial of degree n', just like
$L_{n+l}^{2l+1}$. Hence they can differ at most by a constant factor, which
we shall designate by $N_{nl}$. Therefore we write, placing in evidence
the fact that w and hence R depend on the indices n and l,

$$w_{nl}(x) = N_{nl} L_{n+l}^{2l+1}(x). \tag{283}$$

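The factor $N_{nl}$ is determined from the normalization condition
(252). We find

$$\frac{1}{N_{nl}^2} = r_0^3\, \frac{2n\,[(n + l)!]^3}{(n - l - 1)!},$$

so that the explicit expression of $R_{nl}$ as a function of r is the
following, where we have put for $r_0$ the expression (268), to show more
clearly its dependence upon n:

$$R_{nl}(r) = -\frac{2}{n^2} \sqrt{\frac{(n - l - 1)!}{a^3\,[(n + l)!]^3}}\; e^{-r/na} \left(\frac{2r}{na}\right)^{l} L_{n+l}^{2l+1}\!\left(\frac{2r}{na}\right). \tag{285}$$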

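The explicit expressions corresponding to the first few values
of n and l are, setting $\rho = r/a$,

$$R_{10} = \frac{2}{a^{3/2}}\, e^{-\rho},$$

$$R_{20} = \frac{1}{\sqrt{2}\,a^{3/2}} \left(1 - \frac{\rho}{2}\right) e^{-\rho/2}, \qquad R_{21} = \frac{1}{2\sqrt{6}\,a^{3/2}}\, \rho\, e^{-\rho/2},$$

$$R_{30} = \frac{2}{3\sqrt{3}\,a^{3/2}} \left(1 - \frac{2}{3}\rho + \frac{2}{27}\rho^2\right) e^{-\rho/3},$$

$$R_{31} = \frac{8}{27\sqrt{6}\,a^{3/2}}\, \rho \left(1 - \frac{\rho}{6}\right) e^{-\rho/3},$$

$$R_{32} = \frac{4}{81\sqrt{30}\,a^{3/2}}\, \rho^2\, e^{-\rho/3}.$$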


Equation (285), together with the expression already found for
$Y_{lm}$ [see formulas (245) and (246)], permits us to write down the
complete expression for the eigenfunction corresponding to the
quantum numbers n, l, m:

$$u_{nlm} = -\sqrt{\frac{(l - |m|)!\,(n - l - 1)!\,(2l + 1)}{\pi a^3 n^4\,(l + |m|)!\,[(n + l)!]^3}}\; e^{-r/na} \left(\frac{2r}{na}\right)^{l} L_{n+l}^{2l+1}\!\left(\frac{2r}{na}\right) \sin^{|m|}\theta\, \frac{d^{|m|} P_l(\cos\theta)}{d(\cos\theta)^{|m|}}\, e^{im\varphi}. \tag{286}$$




We observe that two eigenfunctions corresponding to values
of m which are equal but of opposite sign differ only in the sign of
the exponent $im\varphi$, and hence are complex conjugates.

It is interesting to get an intuitive picture of the distribution
of the function $uu^*$ around the nucleus (that is, of the "probability
cloud" of the electron) as given by the preceding formulas.

First of all, we recall [see formula (248)] that in states in which
l = 0 (s-states) u does not depend on $\theta$ or $\varphi$, but only on r. Thus
we have a spherically symmetrical cloud. In particular, the state
corresponding to the lowest energy level, or ground state, is such
a state. In fact, n = 1 necessarily implies l = 0 (and m = 0).

In that case, from the expression for $R_{10}$ and from (248) we
obtain

$$u = \frac{1}{\sqrt{\pi a^3}}\, e^{-r/a}, \qquad \text{and hence} \qquad |u|^2 = \frac{1}{\pi a^3}\, e^{-2r/a}.$$


The density of the cloud decreases exponentially toward infinity. 
It is represented by Fig. 39a, in which the blackening at each point 
is approximately proportional to the probability density $|u|^2$ of
finding an electron there. Hence we can say that the atom has no 
definite contour but, strictly speaking, extends over all space, since 
there exists a certain probability of finding the electron at any 
distance from the nucleus. However, as can be seen, this proba¬ 
bility becomes rather small as soon as r reaches a value two or three 
times a (for example, the probability of finding the electron outside 
of the sphere of radius 3a is about 0.062). In that sense we may 
say that the dimensions of the atom in the ground state are of the 
order of a, that is, of the same order as predicted by the Bohr 
theory (radius of the first circular orbit). 
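The figure of about 0.062 quoted above can be reproduced from the ground-state density. The sketch below assumes $|u|^2 = e^{-2r/a}/(\pi a^3)$, so that the probability of finding the electron beyond $r = xa$ is $\int_x^\infty 4\rho^2 e^{-2\rho}\,d\rho = e^{-2x}(1 + 2x + 2x^2)$ with $\rho = r/a$; the function names are illustrative only.

```python
import math

def prob_outside(x):
    """Probability that the 1s electron lies at r > x*a (closed form)."""
    return math.exp(-2.0 * x) * (1.0 + 2.0 * x + 2.0 * x * x)

def prob_outside_numeric(x, upper=60.0, steps=100000):
    """Same quantity by the midpoint rule on 4 rho^2 exp(-2 rho), rho from x to upper."""
    h = (upper - x) / steps
    total = 0.0
    for k in range(steps):
        rho = x + (k + 0.5) * h
        total += 4.0 * rho * rho * math.exp(-2.0 * rho)
    return total * h

if __name__ == "__main__":
    print(prob_outside(3.0))          # probability outside the sphere of radius 3a
    print(prob_outside_numeric(3.0))  # numerical cross-check
```

The upper limit of 60 in the numerical version is an arbitrary cutoff; the integrand is negligible well before that point.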
Fig. 39a

Fig. 39b
Analogous arguments may be made concerning the other states.
Those for which $l \neq 0$ do not have spherical symmetry; however,
even for these it is of interest to consider R(r), or better $R^2(r)$
since, once we have fixed a straight line starting at the nucleus
(that is, fixed $\theta$ and $\varphi$), the density along that line varies
proportionally to $R^2$. In Fig. 40 the curve R(r) is shown for the states
n = 1, n = 2, n = 3. Also of interest is the function $z(r) = R^2(r)\,r^2$,
whose meaning is the following: the probability of finding the electron
within the infinitesimal spherical shell of radius r and thickness
dr is evidently

$$R^2 r^2\,dr \iint |Y_{lm}|^2 \sin\theta\,d\theta\,d\varphi,$$

and because of (246') this reduces to

$$R^2 r^2\,dr = z(r)\,dr.$$

Fig. 40

The function z(r) is shown graphically in Fig. 41 for the same states
as in Fig. 40.

As may be seen from Figs. 40 and 41, there generally exist
values of r for which R(r) vanishes. These correspond to spheres
on which the probability density is zero (nodal spheres). Figure 39b,
representing the "probability cloud" for hydrogen in the state
n = 2, l = 0 (2s-state), shows an example of this condition. As
may be seen from expression (285) for R, this function vanishes
for r = 0 (except for the case l = 0), for $r = \infty$, and also for the
roots of the polynomial $L_{n+l}^{2l+1}$. The spheres of radius zero and
radius infinity are not counted as nodal spheres, so that the number
of these spheres is equal to the number of positive roots of the
polynomial $L_{n+l}^{2l+1}$; and it may be shown (Perron's theorem) that
all the n' roots of this polynomial of degree n' are positive, so that
the number of nodal spheres is equal to the radial quantum number
$n' = n - l - 1$.
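The count of nodal spheres can be illustrated by finding the roots of the associated polynomial numerically. This is a minimal sketch, assuming the convention $L_K = e^{x}\,d^K/dx^K\,(x^K e^{-x})$ with $L_K^j$ its jth derivative, and assuming that the argument of the polynomial is $x = 2r/(na)$; the names `laguerre` and `radial_nodes` are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def laguerre(K):
    """L_K(x) = e^x d^K/dx^K (x^K e^{-x}), unnormalized convention."""
    P = Polynomial([0.0] * K + [1.0])  # x^K
    for _ in range(K):
        P = P.deriv() - P
    return P

def radial_nodes(n, l):
    """Positive real roots x of the (2l+1)-th derivative of L_{n+l}."""
    poly = laguerre(n + l).deriv(2 * l + 1)
    roots = np.atleast_1d(poly.roots())  # empty for a constant polynomial
    return sorted(float(np.real(r)) for r in roots
                  if abs(np.imag(r)) < 1e-9 and np.real(r) > 0.0)

if __name__ == "__main__":
    for n in range(1, 5):
        for l in range(n):
            nodes = radial_nodes(n, l)
            # the number of nodes found is compared with n - l - 1
            print(n, l, len(nodes), n - l - 1, nodes)
```

Under the stated assumption on the argument, a root x corresponds to a nodal sphere of radius $r = nax/2$; for the 2s-state the single root x = 2 gives r = 2a.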

60. Selection rules. As has been mentioned in §32, the intensity
and state of polarization of the emitted radiation in the quantum
jump from a state n to a state m are determined by the quantities
$x_{nm}$, $y_{nm}$, $z_{nm}$ defined by (144); that is, they are obtained by
integration over all space of the product of the eigenfunctions
corresponding to the two states in question. We shall rapidly outline* the
calculation of these quantities for the case of an electron under the
action of a central force (hence, in particular, for hydrogenlike
systems).

* For a more complete treatment see, for example, No. 1 of the Bibliography.

Since each state is characterized by three quantum numbers,
we must substitute a triplet of indices for each of the indices n and m
just used. We shall denote the triplets by $n_1, l_1, m_1$ and $n_2, l_2, m_2$,
respectively (and sometimes only by 1 and 2), so that the quantities
to be calculated will be $x_{n_1 l_1 m_1,\, n_2 l_2 m_2}$ (which we shall write $x_{12}$ for
short), and so forth.

It is convenient to introduce, instead of the Cartesian coordinates
x and y, their linear combinations

$$\xi = x + iy, \qquad \eta = x - iy,$$

so that, instead of $x_{12}$, $y_{12}$, $z_{12}$, we shall calculate

$$\xi_{12} = \int \xi\, u_1^* u_2\, dS, \qquad \eta_{12} = \int \eta\, u_1^* u_2\, dS, \qquad z_{12} = \int z\, u_1^* u_2\, dS. \tag{287}$$

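It is evident that one may pass from $\xi_{12}$ and $\eta_{12}$ to $x_{12}$ and $y_{12}$
by means of the relations

$$\xi_{12} = x_{12} + i y_{12}, \qquad \eta_{12} = x_{12} - i y_{12}.$$

Introducing the polar coordinates r, $\theta$, $\varphi$, we evidently have

$$\xi = r \sin\theta\, e^{i\varphi}, \qquad \eta = r \sin\theta\, e^{-i\varphi}, \qquad z = r \cos\theta,$$

$$dS = r^2 \sin\theta\, d\theta\, d\varphi\, dr.$$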

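Keeping in mind that each of the functions $u_1$ and $u_2$ is made up
of the product of the three factors $\Phi(\varphi)$, $\Theta(\theta)$, R(r), we see that
each of the triple integrals (287) splits into the product of three
simple integrals:

$$\begin{aligned}
\xi_{12} &= \int_0^{2\pi} \Phi_1^* \Phi_2\, e^{i\varphi}\, d\varphi \int_0^{\pi} \Theta_1^* \Theta_2 \sin^2\theta\, d\theta \int_0^{\infty} r^3 R_1^* R_2\, dr, \\
\eta_{12} &= \int_0^{2\pi} \Phi_1^* \Phi_2\, e^{-i\varphi}\, d\varphi \int_0^{\pi} \Theta_1^* \Theta_2 \sin^2\theta\, d\theta \int_0^{\infty} r^3 R_1^* R_2\, dr, \\
z_{12} &= \int_0^{2\pi} \Phi_1^* \Phi_2\, d\varphi \int_0^{\pi} \Theta_1^* \Theta_2 \cos\theta \sin\theta\, d\theta \int_0^{\infty} r^3 R_1^* R_2\, dr.
\end{aligned} \tag{288}$$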
Let us first deal with the integrals with respect to $\varphi$. The first
of these becomes, upon substitution of the expressions for $\Phi_1$ and $\Phi_2$
according to (229'),

$$\frac{1}{2\pi} \int_0^{2\pi} e^{i(m_2 - m_1 + 1)\varphi}\, d\varphi.$$

Thus it is equal to zero except for the case where the exponent
vanishes, that is, when $m_2 = m_1 - 1$, in which case the integral
is equal to 1. Similarly, the second integral with respect to $\varphi$
vanishes except when $m_2 = m_1 + 1$, and the third integral vanishes
unless $m_2 = m_1$. Hence, indicating by $\Delta m$ the difference $m_2 - m_1$
(change in magnetic quantum number), we may say that in order
that at least one of the three integrals not be zero, we must have

$$\Delta m = \pm 1,\ 0. \tag{289}$$
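The vanishing of the integrals with respect to $\varphi$ can be checked numerically. The sketch below assumes the azimuthal factors have the form $e^{im\varphi}/\sqrt{2\pi}$, so that each integral reduces to $\frac{1}{2\pi}\int_0^{2\pi} e^{i(m_2 - m_1 + s)\varphi}\,d\varphi$ with s = +1, -1, 0 for $\xi_{12}$, $\eta_{12}$, $z_{12}$ respectively; the function name `phi_integral` and the parameter `shift` are introduced here for illustration.

```python
import numpy as np

def phi_integral(m1, m2, shift, npts=256):
    """(1/2pi) * integral over (0, 2pi) of exp(i (m2 - m1 + shift) phi) d phi."""
    # On a uniform periodic grid the mean of exp(i k phi) equals the exact
    # integral for any integer k with |k| < npts.
    phi = np.arange(npts) * (2.0 * np.pi / npts)
    return complex(np.mean(np.exp(1j * (m2 - m1 + shift) * phi)))

if __name__ == "__main__":
    for dm in (-2, -1, 0, 1, 2):
        values = [abs(phi_integral(0, dm, s)) for s in (+1, -1, 0)]
        # columns: delta m, then |xi|, |eta|, |z| azimuthal factors
        print(dm, [round(v, 12) for v in values])
```

For $\Delta m = \pm 2$ all three entries vanish to rounding error, while for each of $\Delta m = -1, +1, 0$ exactly one entry equals 1.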

Otherwise all three quantities $\xi_{12}$, $\eta_{12}$, $z_{12}$ vanish, and hence also
$x_{12}$, $y_{12}$, $z_{12}$, which means (see §32) that the corresponding spectral
line has intensity zero, or in other words that a quantum jump in which m does
not change by ±1 or zero is "forbidden." Hence we have in (289)
the selection rule for the magnetic quantum number. It may also
be added that to a transition $\Delta m = 0$ (only $z_{12}$ different from zero)
there corresponds the emission of plane-polarized radiation like that
which would be emitted by an oscillator vibrating along the z-axis.
To the transitions $\Delta m = \pm 1$ there correspond $z_{12} = 0$ and
$x_{12} = \mp i y_{12}$; that is, the radiation corresponds to that emitted by an
electric moment rotating in the xy-plane (in the clockwise or
counterclockwise direction) and hence is circularly polarized when
viewed along the z-direction and plane-polarized when observed in
the xy-plane.

In analogous fashion we obtain the selection rule for the azimuthal
quantum number l from a consideration of the integrals
with respect to $\theta$. Since the calculation is somewhat lengthy, we
shall confine ourselves to mentioning the main result. We find that
the three integrals (in which, of course, $m_2$ must be replaced by
$m_1 - 1$, $m_1 + 1$, and $m_1$, respectively) all vanish except for the
case where $l_2 = l_1 \pm 1$, so that for a transition $\Delta l = l_2 - l_1$ we find
the selection rule

$$\Delta l = \pm 1. \tag{290}$$
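The rule for l can be illustrated in the simplest case. The sketch below is restricted, as an assumption made here for brevity, to $m_1 = m_2 = 0$, where the polar factors are proportional to the Legendre polynomials $P_l(\cos\theta)$ and the $z_{12}$ integral becomes $\int_{-1}^{1} x\,P_{l_1}(x) P_{l_2}(x)\,dx$ with $x = \cos\theta$; normalization constants are omitted and the function name is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

def theta_integral_z(l1, l2, npts=32):
    """Integral over (-1, 1) of x P_l1(x) P_l2(x) dx, with x = cos(theta)."""
    # Gauss-Legendre quadrature is exact for polynomials of degree <= 2*npts - 1.
    x, w = leggauss(npts)
    P1 = Legendre.basis(l1)(x)
    P2 = Legendre.basis(l2)(x)
    return float(np.sum(w * x * P1 * P2))

if __name__ == "__main__":
    for l1 in range(4):
        row = [round(theta_integral_z(l1, l2), 12) for l2 in range(5)]
        print(l1, row)  # entries are expected to vanish unless l2 = l1 +/- 1
```

In this special case the nonvanishing entries are $2(l+1)/[(2l+1)(2l+3)]$ for the pair (l, l+1), for instance 2/3 for l = 0.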
Finally, a consideration of the integral with respect to r (which
is the same in all three formulas) does not furnish any selection
rule, since it does not vanish (once the selection rules for m and l are
satisfied). Therefore the jump $\Delta n$ of the principal quantum number
n may be anything.

Actually, experiments show that in spectra generally only the
lines satisfying the selection rules just found for m and l occur.
These rules, indeed, already had been discovered empirically before
the rise of wave mechanics† and were justified in various ways in
the Bohr-Sommerfeld theory (for example, by means of the
correspondence principle; see §64).

The actual calculation of the integrals contained in (288) permits
the quantitative evaluation of the probability for the quantum
jump and the intensity of the corresponding spectral line. In the
not very numerous cases in which it has been possible to make a
comparison with experiment, the results of the theory have been
confirmed.

† In certain exceptional cases, lines which would be "forbidden" by the
selection rules also show up, though with considerably reduced intensity. This
effect depends on various factors which the theory accounts for perfectly in a
further approximation (for instance, by considering weak quadrupole transitions
in addition to the strong dipole transitions which yield the selection rules). It
is also to be understood that the transition between two quantum states, even
though prohibited by a selection rule, may always occur by an indirect route
(that is, through intermediate levels) or by a process different from radiation,
such as by collision for example.



CHAPTER 9 


The Bohr-Sommerfeld Theory

61. The method of Wentzel and Brillouin. In this chapter we
shall first of all treat an important method, originated by Wentzel¹
and Brillouin² and perfected by Kramers³ and many others, of
finding approximate expressions for the eigenfunctions and eigenvalues
of the Schrodinger equation. From this method we shall
obtain a quantization rule which essentially coincides with the one
postulated by Sommerfeld, upon which atomic mechanics was
founded until 1925 (see Part I). Having thus obtained the fundamentals
of the Bohr-Sommerfeld theory as a first approximation of
wave mechanics, we shall treat some of the most important results
of that theory, which it would be too long or too intricate to deduce
in a rigorous manner from the Schrodinger equation.

We shall now present the method of Wentzel and Brillouin as
applied to the one-dimensional case, to which the more general
cases may be reduced by means of the separation of variables.
Hence we shall write the one-dimensional Schrodinger equation for
a state of energy E [(140), §34] in the form

$$u'' + \frac{4\pi^2}{h^2}\, p^2 u = 0, \tag{291}$$

where we have put

$$p(x) = \sqrt{2m(E - U)}. \tag{292}$$

We note that for all values of x for which U < E, this function p
is identical with the classical expression for the momentum (taken
in absolute value) of the particle at the point x. In the regions
where U > E, p is imaginary, an indication, according to classical
mechanics, that these regions of the x-axis are inaccessible to the
particle. We shall suppose that the potential energy U has a shape

¹ Zeits. f. Physik 38, 518 (1926).

² Comptes Rendus 183, 24 (1926); J. de Phys. 7, 353 (1926).

³ Zeits. f. Physik 39, 828 (1926).
of the type shown in Fig. 42, so that there is only one region AB
(region II, from $x_1$ to $x_2$) in which p is real, whereas it is imaginary
with positive coefficient in the other two regions (I and III).
Classically, the particle would execute oscillations between A and B.
The linear second-order equation (291) may be transformed into 
one of the first order which is nonlinear (of the Riccati type) by 
means of the well-known transformation from the theory of differ¬ 
ential equations, 

$$u = e^{\frac{2\pi i}{h} \int^{x} y\,dx} \tag{293}$$

(where the lower limit of the integral is any fixed value of x), that is,
by taking as a new unknown function

$$y = \frac{h}{2\pi i}\, \frac{u'}{u}.$$

In fact it may immediately be verified,
upon substituting into (291), that y
must satisfy the Riccati equation

$$\frac{h}{2\pi i}\, y' = p^2 - y^2. \tag{294}$$


If we consider that, in general, the results of wave mechanics
approach those of ordinary mechanics if we neglect quantities of
the order of h, we are led to believe that we may obtain a first
approximation by neglecting h, a second approximation by neglecting
$h^2$ and all higher powers, and so forth. This procedure suggests
trying to find for y an approximate expression Y, of the form

$$Y = Y_0 + \frac{h}{2\pi i}\, Y_1 + \left(\frac{h}{2\pi i}\right)^{2} Y_2 + \cdots + \left(\frac{h}{2\pi i}\right)^{k} Y_k, \tag{295}$$

where $Y_0, Y_1, \ldots$ are functions of x which may be determined
formally by substituting Y for y in (294) and equating the coefficients
of each power of $h/2\pi i$ on both sides. We then find immediately
for the first few terms the following recursion relations:

$$Y_0 = \pm p, \tag{296}$$

$$Y_1 = -\frac{Y_0'}{2Y_0}, \tag{296'}$$

$$Y_2 = -\frac{Y_1' + Y_1^2}{2Y_0}, \tag{296''}$$

and the successive terms may readily be found.
As can be seen, we obtain for Y two expressions (which we shall
indicate by $Y_a$ and $Y_b$), depending on whether we take $Y_0 = +p$ or
$Y_0 = -p$ in the first term. One of these is obtained from the
other by changing the sign of all odd terms. They are (writing
only the first two terms, and representing by dots the succeeding
ones up to the kth term)

$$Y_a = p - \frac{h}{2\pi i}\, \frac{p'}{2p} + \cdots, \tag{297}$$

$$Y_b = -p - \frac{h}{2\pi i}\, \frac{p'}{2p} + \cdots. \tag{297'}$$

These two expressions approximately represent two different
integrals of (294).

It is useful here to clarify the significance of these approximate
solutions. If, rather than breaking off one of these series at the
kth term, we continue it indefinitely, we obtain a diverging series,
and hence we may not consider y approximated by a series of the
type (297) or (297'). However, we may show that in general the
first few terms decrease rapidly and that, if the sum is limited to
these terms, $Y_a$ and $Y_b$ represent, to a good approximation, two
particular integrals of (294) for all points of the x-axis, except those
near the two critical points A and B where p = 0. In fact, at
these points $Y_a$ and $Y_b$ become infinite, whereas the solutions of
(294) should not possess any singularities.

From each of the two functions Y found in this way we may
obtain an approximate solution of (291) by means of (293), and
hence any solution of the latter may be approximated by an expression
of the type

$$u = c_a\, e^{\frac{2\pi i}{h} \int^{x} Y_a\,dx} + c_b\, e^{\frac{2\pi i}{h} \int^{x} Y_b\,dx}, \tag{298}$$

$c_a$ and $c_b$ being constants. This approximation, however, ceases to
be valid in the vicinity of the two critical points A and B. It follows
that in order to represent the same solution u in the three
regions I, II, III of the real axis, different values must be given to
the constants $c_a$ and $c_b$ in each of the regions. We shall indicate
these values by $c_a'$, $c_b'$, $c_a''$, $c_b''$, $c_a'''$, $c_b'''$ respectively. We shall now
see how these values must be related to each other, confining ourselves
from now on to the approximation given by the first two
terms of $Y_a$ and $Y_b$, so that (298) may also be written [see (297)
and (297')], fixing the lower limit at $x_1$ and performing the integration
along the real axis by convention,

$$u = \frac{c_a}{\sqrt{p}}\, e^{\frac{2\pi i}{h} \int_{x_1}^{x} p\,dx} + \frac{c_b}{\sqrt{p}}\, e^{-\frac{2\pi i}{h} \int_{x_1}^{x} p\,dx}. \tag{299}$$
We note first of all that for $x \to -\infty$, U tends toward $+\infty$ and
hence p toward $+i\infty$. It follows that the exponent of the first
term tends toward $+\infty$; and therefore, in order that u may tend
toward zero for $x \to -\infty$, as it should, we must have $c_a' = 0$. Thus
in region I, u is reduced to

$$u = \frac{c_b'}{\sqrt{p}}\, e^{-\frac{2\pi i}{h} \int_{x_1}^{x} p\,dx}. \tag{300}$$

In region II, (299) may also be written (setting $c_a'' = C'' e^{i\delta}$,
$c_b'' = C'' e^{-i\delta}$)

$$u = \frac{2C''}{\sqrt{p}} \cos\left(\frac{2\pi}{h} \int_{x_1}^{x} p\,dx + \delta\right). \tag{301}$$

In order to relate these two expressions, it is necessary to find
an approximate expression for u that is valid in the zone where the
preceding formulas break down, namely, in the vicinity of the
point A. This expression may be obtained by replacing the
potential curve at that point by a short straight-line segment (that
is, by considering the force field to be uniform over the short portion
considered), or by putting the following into the Schrodinger
equation:

$$\frac{8\pi^2 m}{h^2}(E - U) = K_1 (x - x_1),$$

where $K_1$ is a constant (> 0). The Schrodinger equation may
then be solved rigorously, and u is found to be expressed by a
Bessel function or, in another form, by means of a definite integral.
This solution (which has no singularity for $x = x_1$) must be
continuously joined, on one side (for $x < x_1$) to (300), on the other
side to (301); this determines the values of the constants $C''$
and $\delta$ of (301). The calculations will be omitted, but it is found⁴
that in region II the continuation of the solution (300) is represented
by

$$u = \frac{2c_b'}{\sqrt{ip}} \cos\left(\frac{2\pi}{h} \int_{x_1}^{x} p\,dx - \frac{\pi}{4}\right). \tag{301'}$$

⁴ See Kramers, loc. cit., or No. 22 of the Bibliography. We may also
substitute a section of a step for the potential curve, in the vicinity of A. It
may then be joined more easily to (301'), since this method avoids the use of
Bessel functions [see E. Persico, Nuovo Cimento XV, 133 (1938)].

Proceeding to a consideration of region III, we find, by a
method analogous to the preceding, that for u to vanish for $x \to +\infty$,
the second term of (299) must be absent; that is, u must have
the form

$$u = \frac{c_a'''}{\sqrt{p}}\, e^{\frac{2\pi i}{h} \int_{x_1}^{x} p\,dx},$$

where $c_a'''$ is a constant, undetermined for the present and of no
interest here. This expression must be joined to one of the form
(301), valid in region II, across the critical point B. The junction
may be made by the same procedure followed for the point A, and
we find that u must be represented, in region II, by

$$u = \frac{2c_a'''}{\sqrt{ip}}\, e^{\frac{2\pi i}{h} \int_{x_1}^{x_2} p\,dx} \cos\left(\frac{2\pi}{h} \int_{x}^{x_2} p\,dx - \frac{\pi}{4}\right).$$

Now in order that this expression be identical with (301'), the two
cosine terms must be equal, in absolute value, for any x. This
condition requires that the sum or the difference of their arguments
be an integral multiple of $\pi$. But since the difference of the arguments
depends upon x, whereas their sum turns out to be constant,
as we shall see presently, it will be the sum which must be equated
to $n\pi$ (with n an integer), thus yielding

$$2 \int_{x_1}^{x_2} p\,dx = \left(n + \tfrac{1}{2}\right) h. \tag{302}$$


Recalling that according to classical mechanics, the particle
would oscillate between $x_1$ and $x_2$ with momentum +p in the motion
from $x_1$ to $x_2$, and −p in the other half cycle, we may interpret the
left-hand member as the integral of p dx (where p now indicates
the momentum taken algebraically) extended over a complete
oscillation of the classical motion, which is denoted by the symbol
$\oint$. Therefore the last formula may be written

$$\oint p\,dx = \left(n + \tfrac{1}{2}\right) h. \tag{303}$$

This formula does not contain the eigenfunctions, but only E
(contained in p). It may therefore serve to determine (approximately)
the various eigenvalues of the problem without going
through the Schrodinger equation. Upon assigning to n successive
integral values, we obtain the successive eigenvalues of E.
Equation (303) thus represents an approximate quantization rule valid
for all cases in which the potential has the behavior of Fig. 42.

It is important to mention that if n is rather large, (303) may
practically be replaced by

$$\oint p\,dx = nh. \tag{303'}$$
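As a numerical illustration of the quantization rule (303), the following sketch solves $\oint p\,dx = (n + \frac{1}{2})h$ for the energy of a harmonic oscillator. Everything specific here is an assumption chosen for the example: units with $m = \omega = 1$ and $h = 2\pi$, the potential $U = \frac{1}{2}x^2$, the turning points $\pm\sqrt{2E}$, and the function names. The substitution $x = \text{mid} + \text{half}\cdot\sin t$ removes the square-root behavior of p at the turning points before the midpoint rule is applied.

```python
import math

H = 2.0 * math.pi  # Planck's constant in units where h/(2 pi) = 1 (assumption)

def U(x):
    """Harmonic potential with m = omega = 1 (illustrative choice)."""
    return 0.5 * x * x

def turning_points(E):
    """Classical turning points x1 < x2 where U = E."""
    a = math.sqrt(2.0 * E)
    return -a, a

def phase_integral(E, m=1.0, steps=2000):
    """Closed integral of p dx = 2 * integral from x1 to x2 of sqrt(2m(E - U))."""
    x1, x2 = turning_points(E)
    mid, half = 0.5 * (x1 + x2), 0.5 * (x2 - x1)
    dt = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = -0.5 * math.pi + (k + 0.5) * dt
        x = mid + half * math.sin(t)
        p2 = max(2.0 * m * (E - U(x)), 0.0)  # guard against rounding below zero
        total += math.sqrt(p2) * half * math.cos(t)
    return 2.0 * total * dt

def wkb_energy(n, lo=1e-9, hi=100.0):
    """Bisection on E for phase_integral(E) = (n + 1/2) h."""
    target = (n + 0.5) * H
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if phase_integral(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for n in range(4):
        print(n, wkb_energy(n))
```

For this potential the phase integral equals $2\pi E/\omega$, so the rule gives $E = (n + \frac{1}{2})h\nu$; replacing the target by $nh$, as in (303'), shifts every level down by half a quantum.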

Let us now consider the case where x represents an angle rather
than a Cartesian coordinate, which may vary from 0 to $2\pi$ (for
example, the angle $\theta$ in plane-polar coordinates). In this case,
whose importance will be better appreciated in what follows, U
(and hence p) is a periodic function of x with period $2\pi$, and u,
rather than approaching zero at infinity, must also be periodic in x
with period $2\pi$. Let us suppose that E is sufficiently large for p to
be real everywhere; that is, we refer to the case in which the classical
motion would be rotational (not oscillatory). Then u must have
the form (301) everywhere ($x_1$ is any fixed value of x) and will be
periodic with period $2\pi$ only if

$$\frac{2\pi}{h} \oint p\,dx = 2\pi n,$$

with $\oint$ indicating the integral extended over a period. Hence the
quantization condition is in this case exactly (303') rather than
(303).

62. The Sommerfeld conditions. The considerations of the
preceding section justify the very often spectacular success of the
method of quantization postulated by Sommerfeld long before
the rise of wave mechanics. This method consists, as we have
mentioned in Part I, in treating the atom at first as a system of
material points subject to the ordinary laws of classical mechanics
(or, in a further approximation, of relativistic mechanics), and then
in adding to these laws some restrictive conditions (Sommerfeld
conditions) which allow only certain motions among the infinite
number of motions permitted by ordinary mechanics, and hence
only certain values of the energy. Before formulating the Sommerfeld
conditions, we must recall a few fundamental concepts of
rational mechanics.

Let us consider a mechanical system with f degrees of freedom,
with constraints independent of time, and subject to conservative
forces. We shall refer it to a system of generalized coordinates
$q_1, q_2, \ldots, q_f$ (which, in particular, may be the Cartesian coordinates
of the single particles, if we are dealing with a system of points)
to which there correspond as many momenta⁵ $p_1, p_2, \ldots, p_f$;
let us suppose that its motion is such that each of the $q_i$ and the
corresponding conjugate momenta $p_i$ are periodic⁶ functions of the
time with period $T_i$. Motions which, by a judicious choice of
coordinates, exhibit this property are called multiple periodic
motions. Incidentally, we note that in general, all the f periods
$T_i$ will be different. If these periods are equal or commensurable,
the motion is periodic because, after a time which is a common
multiple of all the $T_i$, the motion repeats identically. However, in
general this condition does not obtain; hence the periodic motions
are special cases of multiple periodic motions, characterized by the
fact that we can find f integers $m_1, m_2, \ldots, m_f$ such that

$$\frac{m_1}{T_1} = \frac{m_2}{T_2} = \cdots = \frac{m_f}{T_f},$$

⁵ As is known from mechanics, the kinetic energy T of the system is a function
of the q and the $\dot q$, and we call conjugate momenta the quantities

$$p_i = \frac{\partial T(q, \dot q)}{\partial \dot q_i} \qquad (i = 1, 2, \ldots, f).$$

Since T is a quadratic function of the $\dot q$, the momenta will be linear functions
of the $\dot q$; it is possible to solve for them and express the $\dot q$ as linear functions of
the p.

In particular, if the coordinates q are the ordinary Cartesian coordinates
x, y, z of a point, the corresponding momenta are the components of the linear
momentum; that is, since $T = \frac{1}{2} m (\dot x^2 + \dot y^2 + \dot z^2)$,

$$p_x = m\dot x, \qquad p_y = m\dot y, \qquad p_z = m\dot z.$$

⁶ If one of the q represents an angle such that upon increasing it by $2\pi$ we
again obtain the same configuration of the system (for example, the angle in a
system of plane-polar coordinates), we consider as period $T_i$ relative to this
coordinate the time required for the latter to increase by $2\pi$. We then say
that there is a degree of freedom of "rotation," while those coordinates which
oscillate periodically between two limits are said to correspond to degrees of
"libration" or of "oscillation."
or else, introducing the frequencies $\nu_i = 1/T_i$,

$$m_1\nu_1 = m_2\nu_2 = \cdots = m_f\nu_f.$$

For the present we shall exclude not only this case but also the more
general case in which between the frequencies $\nu_i$ there exist one or
more relations of the type

$$m_1\nu_1 + m_2\nu_2 + \cdots + m_f\nu_f = 0, \tag{304}$$

with integral coefficients. When there are g relations of this kind,
the system is said to be g-fold degenerate. If the motion of a system
with f degrees of freedom is periodic, it is therefore (f − 1)-fold
degenerate. Degenerate systems occur quite often in questions of
atomic physics, but we shall deal with them elsewhere.

Among the multiple periodic systems, we shall be particularly
concerned with those possessing a system of generalized coordinates
such that each of the momenta $p_i$ is a function only of $q_i$ and
not of the others. This condition is to be expressed by saying that
"the variables are separable." In fact, this statement is equivalent
to saying that the Hamilton-Jacobi equation⁷ can be satisfied
by a function of the form

$$W = \sum_i f_i(q_i) \tag{308}$$

rather than by the more general form $W = f(q_1, \ldots, q_f)$. All these
conditions seem very restrictive, but in practice the majority of
systems which occur in the study of atomic mechanics satisfy them,
and hence the Sommerfeld conditions which we are about to state
may be applied to them.

The motion of the system with f degrees of freedom depends,
as we know, on 2f constants $\alpha_1, \alpha_2, \ldots, \alpha_f$, $\beta_1, \beta_2, \ldots, \beta_f$ which
are defined by the initial conditions. Sommerfeld imposes a restriction
upon these constants, forcing them to satisfy the f following
conditions. For each coordinate, we calculate the so-called phase
integral

$$J_i = \oint p_i\,dq_i \tag{309}$$

extended over an entire cycle of the same coordinate. Since $p_i$
depends only on $q_i$ (by hypothesis) and on the f constants $\alpha$, this
integral will be a function $J_i(\alpha_1, \ldots, \alpha_f)$ of the constants $\alpha$ only.
We may also invert these functions and solve for the $\alpha$ as functions
of the f constants J. The Sommerfeld conditions consist in setting
each one of the integrals J equal to an integral multiple of Planck's
constant, that is, in writing

$$\oint p_i\,dq_i = n_i h \qquad (i = 1, 2, \ldots, f), \tag{310}$$

where $n_i$ is a nonnegative integer.⁸ Hence there are as many conditions
as there are degrees of freedom, and we introduce as many
quantum numbers $n_i$ which may replace the f constants $\alpha$. Thus
instead of introducing f constants capable of assuming continuous
values, we introduce an equal number which take on discrete values.

⁷ From analytic mechanics we recall the method of Jacobi for the integration
of the equations of motion, when the constraints are independent of time and
the forces are conservative. We express the total energy (sum of kinetic
energy T and potential energy U) as a function of the q and p [the function
H(q, p) defined in this way is called the Hamiltonian of the system, and contains
within itself everything needed to characterize the mechanical properties
of the system]. Then we substitute for each $p_i$ in the expression for H the
quantity $\partial W/\partial q_i$ (where W is an unknown function of the q), and write the
partial differential equation (Hamilton-Jacobi equation):

$$H\!\left(q_1, \ldots, q_f,\ \frac{\partial W}{\partial q_1}, \ldots, \frac{\partial W}{\partial q_f}\right) = \alpha_1, \tag{305}$$

where $\alpha_1$ is an arbitrary constant. We are now to find a solution
$W(q, \alpha_1, \alpha_2, \ldots, \alpha_f)$ of this equation which, in addition to $\alpha_1$, contains f − 1 other
arbitrary constants $\alpha_2, \alpha_3, \ldots, \alpha_f$. Having found W, we obtain the equations of
motion by writing the following 2f relations between the q, the p, and t (from
which the q and the p may be found explicitly as a function of t):

$$\frac{\partial W}{\partial \alpha_1} = t + \beta_1, \tag{306}$$

$$\frac{\partial W}{\partial \alpha_i} = \beta_i \qquad (i = 2, 3, \ldots, f), \tag{306'}$$

$$p_i = \frac{\partial W}{\partial q_i} \qquad (i = 1, 2, 3, \ldots, f), \tag{307}$$

where the $\beta$ are f other arbitrary constants.

Of these equations, the (306'), which do not contain t, determine the form
of the trajectory of each point; (306) determines the law according to which
points move along the trajectory; and (307) determine the momenta and hence
the velocities (as functions of the q and $\alpha$). The constant $\alpha_1$ has the physical
meaning of total energy. The constants $\beta$, which do not occur in (307), have
the meaning of phases.

Often equation (305) may be solved by separation of variables, that is,
by taking W of the form (308), where of course each $f_i$ depends also on the
f constants $\alpha$. Then evidently (307) will assume the form $p_i = f_i'(q_i, \alpha)$, or
each $p_i$ depends solely on its conjugate $q_i$ and on the $\alpha$. This is the case to
which the Sommerfeld conditions may be applied.

⁸ In fact, it may be easily shown that the integral in the left member is
never negative.

Now since, as we have seen, one of the $\alpha$ is identified with the
energy E of the system, we may regard the energy as a function of
the f constants J, and hence of the $n_i$. We conclude that the
energy may take on only discrete values, depending on the f integers
$n_1, n_2, \ldots, n_f$; these values may be designated by $E_{n_1 n_2 \cdots n_f}$.

Thus, as will be seen, the energy levels are generally found to
be in good agreement with experimental values, and sometimes in
excellent agreement with them.

In the special case of a single degree of freedom, the Sommerfeld 
condition coincides with (303'), which we have deduced in an 
approximate way from wave mechanics. However, we note that 
in the case of one oscillatory degree of freedom we have found that 
the best approximation is obtained by equating the integral to 
$(n + \tfrac{1}{2})h$, whereas in the Sommerfeld method it is equated 
to $nh$ as for the rotational degrees of freedom. Actually, in these 
cases the introduction of half-integral numbers (that is, of the type 
$n + \tfrac{1}{2}$) instead of integral quantum numbers generally 
improves the approximation given by Sommerfeld, as was also observed 
empirically even before the rise of wave mechanics; and in certain 
cases, such as that of the oscillator, it immediately furnishes the 
exact result. On the other hand, for rotational degrees of freedom 
the quantum numbers must be integers, as we have seen. 

In the case of a system with several degrees of freedom and 
with separable variables, the Sommerfeld conditions may be found 
again in an analogous manner (through extension of the considerations 
of §51 by means of separation of variables), as a consequence 
of first approximation of the Schrödinger equation and of the 
conditions of regularity imposed upon the eigenfunctions. 

53. Note on the choice of the coordinate system. Since a 
mechanical system may be referred to an infinite number of generalized 
coordinates, the following question arises: If instead of the 
system of the $q$, we adopt another coordinate system 
$q'_1, q'_2, \ldots, q'_f$ (whose conjugate momenta will be called 
$p'_1, p'_2, \ldots, p'_f$) and if we apply to them the Sommerfeld 
conditions 

\[ \oint p'_i \, dq'_i = n'_i h \qquad (i = 1, 2, \ldots, f), \tag{311} \]

do we get the same quantum orbits? First it is necessary to observe 
that in order to be able to apply the conditions (311), we must be 
able to separate variables in the system of the q' as well. This 
statement implies (if the system is not degenerate) that the 
transformation leading from q to q' must be one of a special, limited 
class, namely, one which neither alters the coordinate lines nor modifies 
the integrals J (or, at the most, performs linear transformations on 
them, with integral coefficients and with a determinant of unity). 
In this case equations (311) determine the same trajectories as (310), 
generally yielding different values for n' in the two cases. 

If the system is degenerate, the situation is different. Then the 
separation of variables is possible even for essentially different 
coordinate systems, which lead to values for J that are not reducible 
to the previous values by linear transformations with integral 
coefficients and unit determinant; and hence quantum trajectories 
are obtained which are different according to the system of reference 
selected. However, in these cases it also happens that the energy 
levels prove to be independent of the choice of reference system. 
This result is attributable to the fact that the energy levels have 
physical significance, whereas the orbits of the Bohr-Sommerfeld 
theory are only an analytic device (see §57). 

Thus in the treatment of degenerate systems there is a certain 
degree of arbitrariness in the selection of the coordinate system. 
Consequently, we generally let ourselves be guided by the following 
criterion. We observe that if we perturb the system ever so slightly 
(for example, by slightly changing the forces which act upon it), the 
system generally ceases to be degenerate, because its periods $T_i$ 
vary slightly and no longer satisfy relations (304). Now very often 
a system appears to be degenerate only because, for simplicity, we 
neglect some of the slight perturbations which are really present. 
For instance, an atom is usually considered uninfluenced by all 
external causes, but in reality it is always located in a magnetic 
field, no matter how small, caused by neighboring atoms, terrestrial 
magnetism, or other factors. It is therefore sufficient to take these 
perturbations into account and to look for a system of reference in 
which the variables are separable. This same system of coordinates 
is then chosen to treat the problem, even when the perturbations 
are being neglected. In fact, it is clear that if the perturbing forces 
are being reduced to zero gradually, their effect upon the motion 
will also tend to zero, whereas their effect in fixing a system of 
reference in which the Sommerfeld conditions apply is independent 
of their magnitude. 

54. The harmonic oscillator. We shall again take up the problem 
studied in §39 in order to treat it by the Sommerfeld method. 
First, from mechanics we know that the particle would execute 
oscillations according to 

\[ x = A \sin(2\pi\nu_0 t - \varphi), \tag{312} \]

where $A$ and $\varphi$ are two arbitrary constants. The momentum 
conjugate to $x$ is 

\[ p = m\dot{x} = 2\pi\nu_0 m A \cos(2\pi\nu_0 t - \varphi). \]

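The total energy is [see (183) in §39] 

\[ E = U + T = 2\pi^2 m \nu_0^2 x^2 + \tfrac{1}{2} m \dot{x}^2 = 2\pi^2 \nu_0^2 A^2 m \]

and is seen to contain only one of the two constants of integration. 
We shall now calculate the phase integral 

\[ J = \oint p \, dx = \oint m\dot{x} \, dx = m \int_0^{1/\nu_0} \dot{x}^2 \, dt
     = 4\pi^2 \nu_0^2 A^2 m \int_0^{1/\nu_0} \cos^2(2\pi\nu_0 t - \varphi) \, dt. \]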

Since the last integral is equal to $1/2\nu_0$, we have 

\[ J = 2\pi^2 \nu_0 A^2 m = \frac{E}{\nu_0}; \]

hence the Sommerfeld condition $J = nh$ gives for $E$ the values 

\[ E_n = n h \nu_0 \qquad (n = 0, 1, 2, \ldots). \tag{313} \]

Thus we find the law (already postulated by Planck in the 
theory of black-body radiation) that the energy of the oscillator is 
always an integral multiple of the "quantum" $h\nu_0$. If we use the 
formula (303) instead, which is a better approximation when dealing 
with a motion of the oscillatory type, we find 

\[ E_n = (n + \tfrac{1}{2}) h \nu_0, \tag{313'} \]

that is, there is added, to the n quanta $h\nu_0$, a fixed amount of 
energy $\tfrac{1}{2} h\nu_0$ which the oscillator always possesses 
(zero-point energy; see also page 183). The same result may be obtained 
by the Schrödinger method and is confirmed in various ways by 
experience. Thus in this special case the method of Wentzel and 
Brillouin (accidentally) furnishes the exact result, as far as the 
energy is concerned. 
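
As a numerical check of the preceding calculation, the following Python 
sketch (not part of the original text) evaluates the phase integral 
J = m ∫ ẋ² dt over one period by the midpoint rule and compares it with 
E/ν₀. The function names, the CGS value of h, and the sample values of 
m and ν₀ used in the usage note are illustrative assumptions only. 

```python
import math

H = 6.62607015e-27  # Planck constant in erg*s (CGS units; assumed value)


def phase_integral(m, nu0, A, steps=20000):
    """Midpoint-rule estimate of J = m * integral of xdot^2 dt over one period."""
    period = 1.0 / nu0
    dt = period / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        # Velocity from x = A sin(2 pi nu0 t); the phase is set to zero,
        # which does not affect an integral over a full period.
        xdot = 2 * math.pi * nu0 * A * math.cos(2 * math.pi * nu0 * t)
        total += xdot * xdot * dt
    return m * total


def energy(m, nu0, A):
    """Total energy E = 2 pi^2 nu0^2 A^2 m of the oscillator."""
    return 2 * math.pi**2 * nu0**2 * A**2 * m


def amplitude_for_level(n, m, nu0):
    """Amplitude for which J = n h, that is, E = n h nu0."""
    return math.sqrt(n * H / (2 * math.pi**2 * nu0 * m))
```

For example, with m = 1e-24 g and ν₀ = 1e13 s⁻¹, calling 
`phase_integral(m, nu0, amplitude_for_level(3, m, nu0))` should return 
a value very close to 3h, in agreement with (313). 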

55. The rigid rotator and the Bohr atom. A rigid rotator is a 
system constituted of a rigid body revolving around a fixed axis, 
not subject to forces (or subject to forces whose moment is zero with 
respect to the axis). The system has a single degree of freedom; 
its position at each instant may be specified by means of an angular 
coordinate $\theta$, expressing the angle between two planes passing 
through the axis of rotation, one being fixed in space and the other 
fixed in the body. The energy of the system, which reduces to 
kinetic energy alone, is given by 

\[ E = \tfrac{1}{2} I \dot{\theta}^2, \]

where I indicates the moment of inertia. 

The momentum conjugate to the coordinate $\theta$ is hence (see 
footnote 5 in §52) 

\[ M = I\dot{\theta}; \]

that is, it represents the angular momentum. As is well known, the 
motion takes place according to the law $M = \text{const.}$, so that 

\[ \oint M \, d\theta = M \oint d\theta = 2\pi M. \]

The Sommerfeld condition therefore gives 

\[ M = n \frac{h}{2\pi}, \tag{314} \]

and may be expressed by saying that the angular momentum must 
be a multiple of $h/2\pi$. Hence the angular velocity must have one 
of the discrete values $n(h/2\pi I)$, and, substituting into the 
expression for $E$, we find for the energy the values 

\[ E_n = \frac{h^2}{8\pi^2 I} n^2. \tag{315} \]

By chance, this result also coincides perfectly with the one 
obtained from a rigorous solution of the Schrödinger equation when 
generalized to include such a system. 

* Since in many other cases too, the angular momenta result as multiples 
of $h/2\pi$, or at least as simple ratios to this quantity, $h/2\pi$ is 
often taken as unit of measurement for angular momentum (quantum unit). 
For example, the result expressed by (314) may be stated by saying, 
“The angular momentum of the rotator is given by an integral number n” 
(implying quantum units). 

We note that the hydrogen atom as conceived by Bohr (see §16 
of Part I), in which the electron is constrained to move in a circular 
orbit, may be mechanically identified with a rigid rotator, and 
hence (314) must also hold for a circular orbit. Thus we see that 
the condition postulated by Bohr to characterize the quantum orbits 
is contained as a special case within the Sommerfeld conditions. 

56. Hydrogenlike systems. We shall now apply the Sommerfeld 
method to the hydrogen atom and, in general, to hydrogenlike 
systems, without the purely artificial restriction of circular orbits as 
adopted by Bohr in the first quantum theory of these systems and 
outlined in §16 of Part I. However, we shall still suppose that the 
nucleus is fixed, for the time being. Hence the system is one with 
three degrees of freedom. 

(a) Mechanical part. It is known from rational mechanics that 
the motion of an electron under the action of a central attractive 
force of intensity $Ze^2/r^2$ (that is, analogous to the Newtonian 
attraction) takes place in elliptic orbits according to Kepler's laws, 
excluding the parabolic and hyperbolic orbits which correspond to 
states in which the electron is not bound to the nucleus. If polar 
coordinates $r$ and $\omega$ are introduced in the plane of the orbit 
(with the pole at the nucleus and the polar axis directed toward the 
perihelion), the equation of the ellipse becomes 

\[ \frac{1}{r} = \frac{Z e^2 m}{M^2} (1 + \varepsilon \cos\omega). \tag{316} \]

The two constants $M$ and $\varepsilon$ are determined by the initial 
conditions; $M$ represents the angular momentum ($M = m r^2 \dot{\omega}$), 
and $\varepsilon$ the eccentricity (we are dealing with ellipses, for 
which $\varepsilon < 1$). In order to find the semiaxes, we note that 
the maximum and minimum values of $r$, that is, the aphelical and 
perihelical distances (corresponding to $\omega = 180°$ and 
$\omega = 0°$), are 


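\[ r_{\max} = \frac{M^2}{Z e^2 m} \, \frac{1}{1 - \varepsilon}, \qquad
   r_{\min} = \frac{M^2}{Z e^2 m} \, \frac{1}{1 + \varepsilon}, \tag{317} \]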


and since the semimajor axis $a$ is evidently given by 
$\tfrac{1}{2}(r_{\max} + r_{\min})$, we will have 

\[ a = \frac{M^2}{Z e^2 m} \, \frac{1}{1 - \varepsilon^2}, \tag{318} \]

while the semiminor axis $b$ is 

\[ b = a \sqrt{1 - \varepsilon^2}. \tag{319} \]

* We select the positive sense of $\omega$ coincident with the sense in 
which the ellipse is traversed, so that $M$ is not negative. 

The kinetic energy is 

\[ T = \frac{m}{2} (\dot{r}^2 + r^2 \dot{\omega}^2) \tag{320} \]

and the potential energy is $-Ze^2/r$. To calculate the total energy 
$E = T + U$ it is convenient (since this energy is a constant) to 
refer to a particular instant of the motion, selected in such a way 
as to simplify the calculation. For instance, for $r = r_{\min}$ we 
have $\dot{r} = 0$; then we find that 

\[ E = -\frac{Z e^2}{2a}, \tag{321} \]

and note that the energy depends solely on the semimajor axis, 
and not on the semiminor axis. 
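
The statement that the energy depends only on the semimajor axis can be 
checked numerically. The sketch below (an illustration added here, not 
taken from the original) evaluates E = T + U at the perihelion, where 
ṙ = 0, using (317) and (318). The function names and the choice of 
units with Ze² = 1 and m = 1 are assumptions made for convenience. 

```python
import math


def orbit_energy(M, eps, Z_e2=1.0, m=1.0):
    """Return (E, a) for a Kepler ellipse of angular momentum M, eccentricity eps.

    Z_e2 stands for the product Z e^2; units are arbitrary but consistent.
    """
    p = M**2 / (Z_e2 * m)        # the factor M^2 / (Z e^2 m) of (316)-(318)
    r_min = p / (1 + eps)        # perihelion distance, from (317)
    r_max = p / (1 - eps)        # aphelion distance, from (317)
    a = 0.5 * (r_max + r_min)    # semimajor axis, (318)
    # At the perihelion rdot = 0, so the kinetic energy is purely angular:
    # T = (m/2) r^2 omegadot^2 = M^2 / (2 m r^2).
    T = M**2 / (2 * m * r_min**2)
    U = -Z_e2 / r_min
    return T + U, a


def angular_momentum_for(a, eps, Z_e2=1.0, m=1.0):
    """Angular momentum of the ellipse with semimajor axis a and eccentricity eps."""
    return math.sqrt(a * Z_e2 * m * (1 - eps**2))
```

Ellipses built with `angular_momentum_for(2.0, eps)` for several values 
of `eps` all have a = 2 and, in these units, the same energy −1/4, as 
(321) requires. 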

(b) Sommerfeld conditions. We note that the system is doubly 
degenerate (since the three coordinates all vary with the same 
period). In order to select a system of coordinates (see §53) it is 
therefore necessary to take some perturbation into account: one of 
these is the relativistic correction applied to the laws of mechanics, 
and another might be the action of a weak external magnetic field. 
It is found that a system perturbed in this way is no longer degenerate 
and that the variables are separable if polar coordinates (in 
space) are taken with the pole at the nucleus and the polar axis 
directed along the field. Hence we shall adopt these coordinates 
even when neglecting the above-mentioned perturbations. Therefore 
let $r$ be the radius vector, $\theta$ the colatitude (angle between the 
radius vector and the polar axis), and $\varphi$ the longitude. The 
kinetic energy then has the known form 


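\[ T = \frac{m}{2} (\dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2\theta \, \dot{\varphi}^2) \tag{322} \]

and hence the momenta conjugate to $r$, $\theta$, $\varphi$ are, respectively, 

\[ p_r = m\dot{r}, \qquad M_\theta = m r^2 \dot{\theta}, \qquad
   M_\varphi = m r^2 \sin^2\theta \, \dot{\varphi}. \]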

The last one has the mechanical meaning of angular momentum 
with respect to the polar axis, or projection upon the polar axis of the 
angular momentum $\mathbf{M}$. This projection is a constant in the 
absence of external forces. 

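We now may write the Sommerfeld conditions, which are 

\[ \oint p_r \, dr = m \oint \dot{r} \, dr = n' h, \tag{323} \]

\[ \oint M_\theta \, d\theta = m \oint r^2 \dot{\theta} \, d\theta = s h, \tag{324} \]

\[ \oint M_\varphi \, d\varphi = m \oint r^2 \sin^2\theta \, \dot{\varphi} \, d\varphi = m_\varphi h, \tag{325} \]

where $n'$, $s$, $m_\varphi$ are three nonnegative integers. 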

Since $M_\varphi$ is a constant, the last equation immediately gives 

\[ \pm 2\pi M_\varphi = m_\varphi h, \]

where the + or − signs hold according to whether the motion takes 
place in the direction of increasing or decreasing $\varphi$. Adopting 
the convention of indicating by $m^*$ the number $\pm m_\varphi$ (where 
the sign is chosen by the preceding criterion), we have 

\[ M_\varphi = m^* \frac{h}{2\pi}, \tag{325'} \]

which amounts to saying that the angular momentum with respect 
to the polar axis is expressed (in units $h/2\pi$) by an integer $m^*$ 
(positive, negative, or zero) which is called magnetic quantum number 
(and which corresponds to the magnetic quantum number m of the wave 
theory introduced in §46). 

An analogous quantization for the (total) angular momentum 
$M = |\mathbf{M}|$ may be obtained from (324) and (325). We note first 
of all that upon equating the expressions (320) and (322) for the 
kinetic energy, and multiplying through by $dt$, we get the identity 

\[ r^2 \dot{\theta} \, d\theta + r^2 \sin^2\theta \, \dot{\varphi} \, d\varphi = r^2 \dot{\omega} \, d\omega. \]

Then, adding the last two Sommerfeld conditions (324) and (325), 
and using the last identity, we obtain 

\[ m \oint r^2 \dot{\omega} \, d\omega = (s + m_\varphi) h. \tag{326} \]

* For the present we shall adopt the notation $m^*$ for the magnetic 
quantum number to avoid confusion with the electronic mass m. Later on, 
when there is no more reason for misunderstandings, we shall write m 
instead, as is customary. 

If we now define a new (nonnegative) quantum number k, putting 

\[ k = s + m_\varphi = s + |m^*|, \tag{327} \]

we may write 

\[ \oint M \, d\omega = k h. \tag{328} \]

Since $M$ is a constant, we obtain from (328) 

\[ M = k \frac{h}{2\pi}, \tag{329} \]

or: the total angular momentum is an integral multiple of $h/2\pi$. The 
quantum number k measuring this angular momentum in units of 
$h/2\pi$ was called azimuthal quantum number in the old Bohr-Sommerfeld 
theory. This designation is now reserved for the number $l$ of 
wave mechanics, which, as we shall see later on, may be identified 
with $(k - 1)$. 

(c) Spatial quantization. The last two Sommerfeld conditions 
determine the inclination of the orbital plane with respect to the 
polar axis, or else with respect to an external magnetic field (which 
may be very weak). In fact, given the angle $\alpha$ that the vector 
$\mathbf{M}$ (which is normal to the orbital plane) makes with this 
axis, we have 

\[ M_\varphi = M \cos\alpha; \]

and hence, substituting expressions (325') and (329) for $M_\varphi$ and 
$M$, we find (remembering, from what was said above, that $k \neq 0$) 

\[ \cos\alpha = \frac{m^*}{k}. \tag{330} \]

It follows that with k fixed, $\cos\alpha$ may take on only the series 
of discrete values $0, \pm 1/k, \pm 2/k, \ldots, \pm 1$. However, in 
order to make our treatment agree with experimental facts, we must 
exclude the case $m^* = \pm k$, that is, $\cos\alpha = \pm 1$, in which 
case the plane of the orbit would be perpendicular to the magnetic 
field. This exclusion, for which the old Sommerfeld theory gives no 
satisfactory justification, may be justified today by means of a 
comparison with the results of wave mechanics, of which this theory must 
represent an approximation (see §57 below). Assuming this for 
the moment, we arrive at the conclusion that the magnetic quantum 
number $m^*$ takes on the $(2k - 1)$ values 

\[ -(k - 1), \; -(k - 2), \; \ldots, \; -1, \; 0, \; 1, \; \ldots, \; (k - 1), \tag{331} \]

to which there correspond as many inclinations of the orbital plane. 

* It is apparent that since $M$ is the momentum conjugate to $\omega$ in 
the system of plane-polar coordinates (in the plane of the orbit), (328) 
may be interpreted as one of the two Sommerfeld conditions which would be 
obtained when the problem is treated in two dimensions, as if the 
electron moved in a given plane. The other is identical to (323). This 
way of treating the problem, that is, with the somewhat artificial 
restriction that the motion take place in a given plane, is the one 
usually followed in the elementary expositions of the Sommerfeld theory. 

The existence of these discrete inclinations is often denoted by 
the expression “spatial quantization.” 

It is apparent that the shape of the orbit does not enter in 
(324) and (325). Therefore, the results which we have deduced 
from these expressions (angular momentum quantization and 
spatial quantization) hold for any central motion, even if the force 
is not of the Coulomb type. 
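
To make the spatial quantization concrete, the following Python sketch 
(added for illustration) lists, for a given azimuthal quantum number k, 
the allowed values (331) of m* together with the inclination α obtained 
from (330). The function name `allowed_inclinations` is an assumption of 
this sketch. 

```python
import math


def allowed_inclinations(k):
    """Return (m_star, alpha_in_degrees) pairs for m* = -(k-1), ..., (k-1).

    alpha is the angle between the angular momentum vector and the polar
    axis, from cos(alpha) = m*/k. The values m* = +k and -k are excluded,
    as in the text.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    return [(ms, math.degrees(math.acos(ms / k))) for ms in range(-(k - 1), k)]
```

For k = 2, for instance, the three allowed values m* = −1, 0, 1 
correspond to inclinations of 120°, 90°, and 60°. 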

(d) Shape of the orbits. We must now take into account the 
remaining Sommerfeld condition (323), where $n'$ (= 0, 1, 2, . . .) 
is called radial quantum number. 

The integral is calculated by taking $\omega$ as variable of integration 
and noting that 

\[ \dot{r} = \frac{dr}{d\omega} \dot{\omega} = \frac{M}{m r^2} \frac{dr}{d\omega}, \]

so that we may write 

\[ \oint m\dot{r} \, dr = M \oint \frac{1}{r^2} \left( \frac{dr}{d\omega} \right)^2 d\omega; \]

substituting expression (316) for $1/r$, we obtain 

\[ \oint m\dot{r} \, dr = M \varepsilon^2 \int_0^{2\pi} \frac{\sin^2\omega \, d\omega}{(1 + \varepsilon \cos\omega)^2}. \]

The integration may be effected by first integrating by parts 
and then making the transformation $\cos\omega = (1 - x^2)/(1 + x^2)$ 
(or by a more rapid and elegant method due to Sommerfeld), and 
we find 

\[ \oint m\dot{r} \, dr = 2\pi M \left( \frac{1}{\sqrt{1 - \varepsilon^2}} - 1 \right). \]

* In the old quantum theory the value $m^* = 0$ was usually excluded, for 
reasons analogous to those which lead us to omit $k = 0$; the result, 
however, was not at all satisfactory (see Geiger and Scheel, Hdb. der 
Physik XIII, 1st ed., pages 145 and 164). 


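Hence, taking into account (329), (323) yields the following 
condition for $\varepsilon$: 

\[ \frac{1}{\sqrt{1 - \varepsilon^2}} - 1 = \frac{n'}{k}, \]

or else, by introducing a new integer (which is to be identified with 
the “principal quantum number” of §47), 

\[ n = k + n', \tag{332} \]

\[ 1 - \varepsilon^2 = \frac{k^2}{n^2}. \tag{333} \]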

Substituting this expression for $1 - \varepsilon^2$ into (318), and 
(329) for $M$, we obtain for $a$ the expression (depending only on n) 

\[ a_n = \frac{h^2}{4\pi^2 Z e^2 m} n^2. \tag{334} \]

Similarly, from (319), we see that the ratio of the semiaxes is 

\[ \frac{b}{a} = \frac{k}{n}. \tag{335} \]

Generally the two integers k and n (rather than k and $n'$) are 
used to characterize an orbit. 

We see that k is less than or, at most, equal to n, since $n'$ cannot 
be negative. We have $k = n$, that is, $n' = 0$, in the case of circular 
orbits (hence the n introduced into the Bohr theory, restricted to 
circular orbits, may be considered to be either an azimuthal or a 
principal quantum number). In addition, we must exclude the 
case $k = 0$. This again is a restriction which may be justified 
rigorously only by a comparison with the Schrödinger theory 
(see §57). In the old theory this restriction was justified by the 
argument that the case $k = 0$ would correspond to $b = 0$, as (335) 
shows; that is, the ellipse would degenerate into a line segment, and 
the electron would collide with the nucleus. 

* See, for example, No. 18 of the Bibliography, page 655. 

We then assume that the azimuthal quantum number k, with 
fixed n, may take on only the n values 

\[ k = 1, 2, \ldots, n, \tag{336} \]

and that to these there correspond as many ellipses, all having the 
same semimajor axis but different semiminor axes. The last one is 
the circle of radius $a_n$; of course all these ellipses have one focus 
at the nucleus (see Fig. 43). 
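
The radial condition can be verified numerically. In the sketch below 
(an illustration, not part of the original), the eccentricity is taken 
from 1 − ε² = k²/n², the angular integral of part (d) is evaluated by 
the midpoint rule, and the result, multiplied by k/2π, should reproduce 
the radial quantum number n' = n − k. The function names and the number 
of integration steps are assumptions of this sketch. 

```python
import math


def eccentricity(n, k):
    """Eccentricity of the orbit (n, k), from 1 - eps^2 = k^2 / n^2."""
    return math.sqrt(1 - (k / n) ** 2)


def radial_integral_over_M(eps, steps=20000):
    """Midpoint-rule value of eps^2 * integral_0^{2 pi} sin^2(w) / (1 + eps cos w)^2 dw.

    This equals (1/M) times the closed integral of p_r dr for a Kepler ellipse.
    """
    dw = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        w = (i + 0.5) * dw
        total += math.sin(w) ** 2 / (1 + eps * math.cos(w)) ** 2
    return eps**2 * total * dw


def radial_quantum_number(n, k):
    """n' recovered from the phase integral, using M = k h / (2 pi)."""
    return k * radial_integral_over_M(eccentricity(n, k)) / (2 * math.pi)
```

For n = 3 the three orbits k = 1, 2, 3 have axis ratios b/a = 1/3, 2/3, 1, 
and `radial_quantum_number(3, k)` returns values close to 2, 1, and 0. 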

(e) Energy levels. We note first that all ellipses corresponding 
to the same n and having the same $a$ will have the same energy. 
Therefore the energy depends only on the principal quantum number 
n and not on k. Substituting (334) into (321), we obtain, for 
the energy of any one of the orbits of principal quantum number n, 

\[ E_n = -\frac{2\pi^2 Z^2 e^4 m}{h^2} \, \frac{1}{n^2}, \tag{337} \]

an expression which is identical with that already found for the nth 
circular orbit in the Bohr theory (see §16 of Part I). 

Hence the consideration of the elliptic orbits in hydrogen and 
in hydrogenlike systems does not add any new energy levels. For 
this reason the Bohr theory, although limited to circular orbits, 
already accounted for all observed spectral lines. The more complete 
theory, which has just been developed, tells us, however, that 
to each energy level there generally corresponds more than one type 
of motion because of the already mentioned degeneracy of the 
system. In fact, having fixed n (that is, the major axis), the 
azimuthal quantum number k may assume the values (336), and 
we therefore get different “shapes” of orbits. To each value of 
k there correspond $(2k - 1)$ values of $m^*$, and hence each orbit 
may assume as many inclinations. Therefore the total number 
of orbits corresponding to a given principal quantum number n 
and hence to a given energy level (also called statistical weight or 
multiplicity of the level) is 

\[ \sum_{k=1}^{n} (2k - 1) = n^2. \]

Thus to the first level there corresponds one orbit, to the second 
level four orbits, and so forth. Every spectral line in general 
results from several kinds of quantum jumps. For instance, the 
first line of the Balmer series is emitted by all atoms in which the 
electron jumps from any one of the orbits n = 3, k = 1, 2, 3, to 
any one of the orbits n = 2, k = 1, 2. 

It is to be noted that a slight perturbation (for example, an 
electric or magnetic field) is sufficient to remove, totally or partially, 
the perfect overlapping of the energy values corresponding to the 
orbits of the same principal quantum number n. Each energy 
level will then split into a group of very closely spaced levels, 
corresponding to different values of k and of m*. 
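
The counting of orbits and the energy levels (337) may be illustrated by 
a short Python sketch (added here for illustration). It enumerates the 
pairs (k, m*) allowed by (336) and (331) for a given n and evaluates the 
level energy in electron volts. The function names and the numerical 
value of the Rydberg energy, 13.605693 eV for a nucleus of infinite mass, 
are assumptions of this sketch rather than values quoted in the text. 

```python
RYDBERG_EV = 13.605693  # assumed value of 2 pi^2 e^4 m / h^2 in eV (fixed nucleus)


def orbits(n):
    """All (k, m_star) pairs for principal quantum number n.

    k runs over 1..n as in (336); m* runs over -(k-1)..(k-1) as in (331).
    """
    return [(k, ms) for k in range(1, n + 1) for ms in range(-(k - 1), k)]


def energy_ev(n, Z=1):
    """Energy of level n for nuclear charge Z, following (337), in eV."""
    return -(Z**2) * RYDBERG_EV / n**2
```

`len(orbits(n))` gives 1, 4, 9, ... for n = 1, 2, 3, ..., and 
`energy_ev(3) - energy_ev(2)` is about 1.89 eV, the energy of the first 
Balmer line. 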

57. Comparison with the Schrödinger theory. The theory of 
the hydrogen atom developed in the preceding section must be 
considered, as we know, as a first approximation of the Schrödinger 
theory, in the sense that the numerical values which are obtained 
from it for the various quantities having physical significance 
(energy, angular momentum, and so on) are approximately equal 
to those deducible from the rigorous theory developed in Chapter 8. 
However, the energy levels turn out to be exactly the same. The 
values of the angular momentum with respect to the axis, that is, 
of $M_\varphi$, are also exactly the same, provided that we assign to 
the magnetic quantum number $m^*$ only the values (331) (excluding the 
values $\pm k$) and that we identify k with $(l + 1)$ of the wave theory 
(from which the exclusion of the value $k = 0$ follows). Hence the 
multiplicities of each level are also the same in the two theories. 

On the other hand, the expression (329) of the angular momentum 
is only approximate. In fact, we have already pointed out 
that in the rigorous theory the latter is given (in units of $h/2\pi$), 
not by k but by $\sqrt{l(l + 1)}$ or else by $\sqrt{k(k - 1)}$, or 
$\sqrt{kl}$. In particular, for $k = 1$ we should get $M = 0$, whereas 
(329) gives $M = h/2\pi$. Sometimes it may be convenient to substitute 
for (329) the formula 

\[ M = (k - 1) \frac{h}{2\pi} = l \frac{h}{2\pi}, \tag{329'} \]

that is, to measure $M$ (in quantum units) by $l$ rather than by k. 
Equation (329') gives a slightly better approximation than (329) 
and immediately gives the right result for $k = 1$. 

* Later it will be seen that some of these quantum transitions do not 
occur in reality, since they are excluded by selection rules. 

Let us now examine the physical significance which may be 
attributed to the orbits of the Bohr-Sommerfeld theory and to the 
motion of the electron along these orbits. In order physically to 
define the trajectory of the electron, as well as its motion, it would 
be necessary to be able to determine (at least conceptually) the 
position of the electron at successive instants, with uncertainties 
small compared with the dimensions of the orbit, and to repeat the 
observation many times upon the same atom. But in order to 
perform an observation of position with an uncertainty small compared 
with the radius $a$ of the orbit, it is necessary, by the uncertainty 
principle, to impart to the electron an indeterminate change 
in momentum of the order of magnitude $\Delta P \gtrsim h/a$. Hence in 
the succeeding observation the motion is already perturbed. We 
see, then, that the physical determination of the orbit is conceptually 
possible (approximately) only if the perturbation $\Delta P$ is 
negligible compared with the momentum $P$ which the electron possesses 
in its orbital motion. Now, $P = M/a = (h/2\pi a)\,n$. Hence the 
condition is 

\[ \frac{h}{a} \ll \frac{h}{2\pi a} \, n \qquad \text{or} \qquad n \gg 2\pi. \]

Therefore we may say that the larger n is, the more exactly may 
an orbit be physically determined. We cannot attribute any 
physical significance to the first few orbits, for which n = 1, 2, . . . . 

* Here, for simplicity, we refer to circular orbits, but the reasoning may 
immediately be extended to elliptic orbits by replacing $a$ and $P$ by 
their average values. 

In order to clarify what has been said above and to throw some 
light upon the relationship of the Bohr-Sommerfeld model to wave 
mechanics, let us suppose that we wish to determine the successive 
positions of an electron within an atom, by irradiating it with 
radiation of wavelength short compared with the dimensions of the 
atom and observing the scattered light, just as in §23. (Obviously, 
we are dealing with idealized experiments.) If the atom is in one 
of the lower quantum states, its dimensions will be of the order of 
$10^{-8}$ cm, and therefore X rays will have to be used. But it is well 
known that one of the X-ray photons is sufficient to ionize the 
atom, and hence the electron is already expelled from the atom in 
the first observation, so that it is impossible to repeat the 
observation on the same atom. It is therefore not feasible to detect 
the orbit, but only one position of the electron with respect to the 
nucleus, and this operation may be performed with as great a 
precision as desired. If we then repeated this observation on many 
atoms (all in the same state), we could establish the statistics of 
the results obtained; or we could attribute to every point of space 
around the nucleus a “probability density” of finding the electron 
there. This is just the probability density which wave mechanics 
shows us how to calculate. Thus, for instance, if the atoms in 
question are all in their ground state (n = 1), the results of the 
position observations on the electron will be symmetrically 
distributed around the nucleus, becoming denser as we get closer to 
the nucleus, as shown by Fig. 39a. 

If instead we were to consider atoms in a higher quantum state 
and hence of larger dimensions, we could be content to observe 
the position of the electron with less precision and hence with light 
of lower frequency, incapable of producing ionization. We can 
therefore think of repeating the observation many successive times 
on the same atom, thus approximately detecting the orbit of the 
electron and the law of its motion. But since, by the first 
observation, we have already perturbed the momentum (and the energy) 
of the electron in an uncontrollable way, we may not assign that 
orbit to a definite quantum state, and the indeterminacy of the 
state (that is, of n, k, m) will be the larger the more precisely the 
measurements of position have been carried out. This entire 
argument is faithfully represented in the scheme of wave mechanics. 
In fact, by superimposing a certain number of eigenfunctions 
corresponding to different values of n, k, m, we may build up as compact 
a wave packet as desired, which (see §26) moves around the nucleus 
and executes approximately the motion of the electron in the 
Bohr-Sommerfeld theory. This packet, if properly constructed, 
represents the probability density at every instant and hence permits 
the results of successive position observations to be predicted 
statistically. If these observations are performed with great 
precision (that is, if we make our observations with radiation of very 
short wavelength, compatible with the necessity of avoiding 
ionization), we must construct a very small wave packet, for which it 
will be necessary to superimpose eigenfunctions belonging to a wide 
range of values for n, k, m; that is, we must leave the quantum 
state highly indeterminate. Hence we see that the concept of 
“orbit” and that of “quantum state” are complementary in the 
sense of Bohr (see §22), just as are those of position and velocity, 
and so on. 

Evidently, then, it is not possible to construct a sufficiently 
small wave packet if we want to use only the first three or four 
eigenfunctions, for example. This impossibility corresponds to the 
fact, already mentioned, that the orbits corresponding to the lowest 
values of n have no physical significance. 

58. Motion of the nucleus. An important refinement which 
must be made in the preceding theory of hydrogenlike systems 
consists in accounting for the fact that the nucleus is not strictly 
fixed, as we have assumed up to now, but describes a small orbit about 
the center of gravity of the system. As is known from mechanics, 
the two-body problem may be reduced to that of a single body 
attracted by a fixed center, provided that we modify the mass of the 
moving body slightly. The motion of an electron of mass $m$ with 
respect to the nucleus of mass $M$ is governed by the same equations 
as the motion, with respect to a fixed nucleus, of a particle of mass 
(reduced mass) 

\[ m' = \frac{m}{1 + \dfrac{m}{M}}. \tag{338} \]

The Sommerfeld conditions are also obtained from those for 
the fixed nucleus by simply substituting $m'$ for $m$. This fact is 
understood most easily if we take for generalized coordinates of the 
system the polar coordinates of the nucleus with respect to the 
center of mass $(r_0, \theta_0, \varphi_0)$, and the polar coordinates of 
the electron with respect to the nucleus $(r, \theta, \varphi)$. We then 
find that the momenta conjugate to the former are identically zero, and 
those conjugate to the latter three are 

\[ p_r = m'\dot{r}, \qquad M_\theta = m' r^2 \dot{\theta}, \qquad
   M_\varphi = m' r^2 \sin^2\theta \, \dot{\varphi}. \]

Hence, of the six Sommerfeld conditions, the first three are 
identically satisfied, and the other three coincide with (323), (324), 
(325) except for the substitution of $m'$ for $m$. The energy levels are 
therefore given by 

\[ E_n = -\frac{2\pi^2 Z^2 e^4 m'}{h^2} \, \frac{1}{n^2} = -Z^2 \frac{R_M h c}{n^2}, \]

where we have put (in analogy to the expression for the Rydberg 
constant given in §16 of Part I): 

\[ R_M = \frac{2\pi^2 e^4 m'}{c h^3} = \frac{R}{1 + m/M}, \]

and the spectral terms are 

\[ -\frac{E_n}{hc} = Z^2 \frac{R_M}{n^2}. \tag{339} \]


* Note that $M_\theta$ and $M_\varphi$ still have the significance of 
angular momenta of the system; hence the azimuthal quantum number k 
retains its meaning. 

Therefore the effect of the motion of the nucleus amounts to 
replacing the Rydberg constant $R$ by the slightly smaller constant 
$R_M$. This correction, which depends on the mass $M$, is 
slightly different for different hydrogenlike systems. For instance, 
we find that 

    for H:     $R_H = 109,677.58$ cm$^{-1}$, 
    for He$^+$:  $R_{He^+} = 109,722.26$ cm$^{-1}$. 

These results indicate that the frequencies of the same line in the 
various hydrogenlike spectra should not be exactly proportional 
to $Z^2$. For example, the even lines of the Pickering series will not 
exactly coincide with those of the Balmer series (as would be 
expected from the first approximation of §16 of Part I) but will be 
slightly displaced toward the violet. This is actually the case, and 
the difference (which is of the order of one angstrom) is perfectly 
observable in the spectrum given by a mixture of hydrogen and 
helium. In fact, the measurement of these differences constitutes 
one of the most exact methods for the determination of the ratio 
of the mass of the electron to the mass of the nucleus. The value 
of $R$ given by (21) of §16 is the one that would correspond to a 
nucleus of infinite mass, and is therefore sometimes designated 
by $R_\infty$. 

An entirely analogous correction may be made in the Schrödinger 
theory and will lead to the same result (see §21 of Part III). 
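
The size of the nuclear-motion correction can be estimated with a short 
Python sketch (added for illustration). It applies R_M = R_∞/(1 + m/M) 
and compares the first Balmer line of hydrogen with the coinciding 
Pickering line of He⁺ (n = 6 to n = 4). The numerical constants are 
modern reference values assumed for this sketch, not the figures used 
in the text, so the last digits may differ slightly from those quoted 
above; the function names are likewise illustrative. 

```python
R_INF = 109737.31568  # Rydberg constant for infinite nuclear mass, cm^-1 (assumed)

# Ratio of nuclear mass to electron mass (assumed modern values).
NUCLEUS_TO_ELECTRON = {"H": 1836.15267, "He+": 7294.29954}


def rydberg(species):
    """Rydberg constant R_M = R_inf / (1 + m/M) for the given nucleus, in cm^-1."""
    return R_INF / (1 + 1 / NUCLEUS_TO_ELECTRON[species])


def wavelength_angstrom(species, Z, n_low, n_high):
    """Vacuum wavelength of the line n_high -> n_low, from the terms Z^2 R_M / n^2."""
    wavenumber = Z**2 * rydberg(species) * (1 / n_low**2 - 1 / n_high**2)  # cm^-1
    return 1e8 / wavenumber  # 1 cm = 1e8 angstrom


# First Balmer line of H and the Pickering line of He+ that nearly coincides with it.
balmer = wavelength_angstrom("H", 1, 2, 3)
pickering = wavelength_angstrom("He+", 2, 4, 6)
shift = balmer - pickering  # positive: the He+ line lies toward the violet
```

With these constants `rydberg("H")` is close to 109,677.58 cm⁻¹ and the 
He⁺ line falls between two and three angstroms to the violet side of 
the hydrogen line, consistent with the order of magnitude stated above. 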

59. Atoms with several electrons. For atoms (or ions) with 
more than one electron, the Bohr-Sommerfeld method would lead 
first of all to a consideration of the mechanical problem of the 
motion of the electrons under the action of nuclear attraction and 
their mutual repulsions. (This repulsion is not quite negligible, 
as is the attraction of the planets in the astronomical problem, to a 
first approximation.) The solution of this problem is in general 
practically impossible. But even where an approximate solution 
has been found (in the case of the helium atom, with two electrons), 
it was realized that by quantizing its motion by the Sommerfeld 
conditions, energy levels which were grossly in error were obtained. 
Such a result implies that in this case the approximation given by 
the first two terms of (295) is entirely inadequate. Nevertheless, a 
series of qualitative results of the greatest importance was obtained 
by means of the following considerations. 

Since the spectral terms of many atoms and ions may be represented
in the Rydberg or the Ritz forms (see §15 of Part I), which
are analogous to the Balmer form, Bohr was led to consider that
the emission of these spectra is produced by the quantum jump
of only one of the electrons, while the other electrons retain their
orbits almost unchanged. Thus we see the possibility of considering
(as an artifice) the atom or the ion to be composed of two
parts. One part, consisting of the nucleus and all the electrons
but one, is called the core or remainder of the atom, and is considered to
have an invariant structure, to a first approximation. The other
part, consisting of the remaining electron, which is called the emission
electron, may travel on various quantum orbits under the action
of the forces exerted by the core. This electron is usually one of
the so-called valence electrons, that is, one of the electrons which,
because they are less strongly bound to the atom, may be more
easily removed from the atom when it is transformed into a positive
ion.

^18 See a treatise on spectroscopy (e.g., G. Herzberg: Atomic Spectra and
Atomic Structure, New York: Dover, 1944), or else No. 23, 27, or 32b of the
Bibliography.

The forces which the core exerts upon the emission electron,
though naturally rather complex, may be schematized by considering
all the electrons of the core as forming a continuous distribution
or "cloud" of negative electricity around the nucleus, which is static
and spherically symmetrical, so as to give rise to a central force field
which is superimposed upon the field due to the nucleus.^19 We
know that such a field, for points outside of the cloud, is the same
as if the cloud were concentrated at the center, or else as if the positive
charge of the nucleus were diminished by an amount equal to the
total charge of the electrons constituting the cloud. If
there are $\sigma$ electrons, the charge of the nucleus is effectively reduced
from $Ze$ to $(Z - \sigma)e$ (effective charge). Therefore the emission
electron is attracted toward the core (as long as it is outside the
core) by a force $(Z - \sigma)e^2/r^2$ instead of $Ze^2/r^2$. Hence it is said,
somewhat inaccurately, that the electrons of the core exert a
screening action upon the nuclear attraction, and the number $\sigma$ is
called the screening constant. If, for instance, we are dealing with
a neutral atom, we shall have $\sigma = Z - 1$, and hence the effective
charge will be $e$. If we are dealing with an atom ionized $z$ times,
we shall have $\sigma = Z - 1 - z$, and hence the effective charge will be
$(z + 1)e$.

If, however, the emission electron penetrates into the core, the
screening action will decrease, because, as we know, the field produced
by a spherical distribution of electricity at a point within it,
at a distance r from the center, is the same as would be produced
by that portion of the electric charge which is contained within the
sphere of radius r, whereas the external charge remains ineffective.
Hence for an electron penetrating the core, the effective charge of
the nucleus gradually increases with decreasing r and approaches
the true charge Ze when r approaches zero. In that case, then,
we may say that the emission electron moves in a field which is
Newtonian outside of a certain sphere (boundary of the core) and
non-Newtonian inside. The law of variation of the field within
the core depends on the law with which the electric charge density
is assumed to be distributed there. It may be idealized in various
ways, according to the degree of approximation desired.

^19 According to wave mechanics, this idealization now appears less arbitrary
than one might think. In fact, very often the Schrödinger $\psi$-function
corresponding to the electrons of the core is such that the mean electric charge
density (proportional to $|\psi|^2$) is independent of time and is spherically symmetrical.
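
One possible idealization can be sketched numerically. The example below assumes, purely for illustration, that the core electrons form a uniformly charged sphere; the function name, the core radius, and the sodium-like numbers are hypothetical choices and are not taken from the text. The screening constant is written `sigma`.

```python
def effective_charge(r, Z, sigma, core_radius):
    """Effective nuclear charge (in units of e) felt at distance r.

    Assumed model: the sigma core electrons form a uniformly charged
    sphere of radius core_radius (one of many possible idealizations).
    """
    if r >= core_radius:
        return Z - sigma                       # full screening outside the core
    enclosed = sigma * (r / core_radius) ** 3  # cloud charge inside radius r
    return Z - enclosed                        # screening weakens as r -> 0


# Example: neutral sodium-like atom, Z = 11, sigma = Z - 1 = 10 core electrons
Z, sigma, a = 11, 10, 1.0
for r in (0.0, 0.5, 1.0, 3.0):
    print(r, effective_charge(r, Z, sigma, a))
```

Outside the core the electron sees the charge (Z - sigma)e of the Newtonian field; inside, the effective charge rises continuously toward Ze, which is the qualitative behavior described above.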

The calculation of the energy levels of an atom with several 
electrons therefore reduces, in this approximation, to the calculation 
of the energy levels of a single electron in a non-Newtonian field. 
This problem will be attacked in a manner analogous to that of 
hydrogenlike systems. The ordinary mechanics of central motions 
will allow us to determine a continuous infinity of motions. It
turns out that these motions may, to a first approximation, be considered
as Keplerian motions upon which there is superimposed a
slow, uniform rotation, called precession, so that the trajectory
has the shape of Fig. 44 and is called a rosette. Then, imposing
the three Sommerfeld conditions, we find the quantum orbits,
which in this case also happen to depend upon three integers: an
azimuthal quantum number k (= 1, 2, . . .) or l (= 0, 1, 2, . . .),
which measures the angular momentum in units of $h/2\pi$; a magnetic
quantum number m (= 0, ±1, . . . ±l), expressing the projection
of this angular momentum upon an axis; and a principal quantum
number n. In general, however, circumstances will not be such
that the energy will depend only on n, as in the hydrogenlike
systems, but rather will depend on n and l (not on m, by reason of
the spherical symmetry of the field^20), and therefore the energy
levels as well as the spectroscopic terms will constitute a series with
double index rather than with a single index. That is, we shall
write $E_{nl}$, $T_{nl}$. This is the most important result of these considerations
and retains its value even in wave mechanics. It may be
added that, by judiciously specifying the field created by the core,
Sommerfeld has succeeded in finding the Rydberg form for spectral
terms, and in a further approximation, the Ritz form.^21 It is now
evident that if the orbit of the emission electron were lying entirely
outside the core, where the field is Newtonian (nonpenetrating orbit),
we should again find the energy levels of hydrogen and hence the
Balmer terms. This justifies the experimental fact that the terms
of any atom or ion approach the Balmer form more closely, the
higher their quantum numbers.
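
The approach to the Balmer form can be illustrated with the Rydberg form written as $T_{nl} = R/(n - \delta_l)^2$. The quantum defects $\delta_l$ used below are rough illustrative figures of the kind quoted for sodium in spectroscopy texts; they are assumed inputs, not data from this book, and the function names are hypothetical.

```python
R = 109737.3  # Rydberg constant in cm^-1 (infinite-mass value, modern figure)

# Illustrative quantum defects for l = 0, 1, 2 (assumed, sodium-like values).
QUANTUM_DEFECT = {0: 1.35, 1: 0.86, 2: 0.01}


def rydberg_term(n, l, z_eff=1):
    """Term value T_nl = z_eff^2 R / (n - delta_l)^2 (Rydberg form), in cm^-1."""
    delta = QUANTUM_DEFECT.get(l, 0.0)
    return z_eff**2 * R / (n - delta) ** 2


def balmer_term(n, z_eff=1):
    """Hydrogenlike (Balmer) term z_eff^2 R / n^2, in cm^-1."""
    return z_eff**2 * R / n**2


for n in (3, 6, 12):
    ratios = [rydberg_term(n, l) / balmer_term(n) for l in (0, 1, 2)]
    print(n, [round(x, 3) for x in ratios])
```

In this sketch the ratio of the Rydberg term to the Balmer term tends to 1 both as n grows and as l grows, that is, as the orbit penetrates the core less.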

In spectroscopy it has become conventional to indicate the
spectral terms corresponding to the energy levels of an electron
by means of a small letter designating the value of the azimuthal
quantum number l, preceded by a number indicating the principal
quantum number, rather than by the notation $T_{nl}$. The azimuthal
quantum number is designated by means of the following letters,
derived from the terminology of the old spectroscopy (we shall also
give the value of k, the azimuthal quantum number in the Sommerfeld
theory):

    l =            0   1   2   3   4   5 . . .
    k =            1   2   3   4   5   6 . . .
    Letter used:   s   p   d   f   g   h . . .

(Terms with higher l rarely occur.) Hence, for example, the term
for which n = 3 and l = 0 is indicated by 3s rather than by $T_{30}$,
and we speak of terms of the s-series, of the p-series, and so on, or
also of s-terms, p-terms, and so on. It is apparent that, since always
$n \ge l + 1$, the s-series starts with the term 1s, the p-series with the
term 2p, the d-series with the term 3d, and so on.
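
The notation can be summarized in a few lines of code. This is only a sketch of the convention just described; the function names are hypothetical.

```python
LETTERS = "spdfgh"  # letters for l = 0, 1, 2, 3, 4, 5


def term_label(n, l):
    """Spectroscopic label such as '3s' for principal n and azimuthal l."""
    if not (0 <= l <= n - 1):
        raise ValueError("quantum numbers must satisfy n >= l + 1")
    if l >= len(LETTERS):
        raise ValueError("no letter tabulated here for this l")
    return f"{n}{LETTERS[l]}"


def sommerfeld_k(l):
    """Azimuthal quantum number k of the Sommerfeld theory, k = l + 1."""
    return l + 1


# First term of each series: the smallest n allowed for a given l is l + 1.
first_terms = [term_label(l + 1, l) for l in range(4)]
print(first_terms)
```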

^20 The energy will depend on m when the atom finds itself in a magnetic
field of an intensity sufficient to perturb the motion. The Zeeman effect is
then produced.

^21 See No. 18 of the Bibliography.

Graphically, the levels corresponding to the terms are usually
represented in different columns: one for the s-series, one for the
p-series, and so on, as shown on the right of Fig. 45.

In hydrogenlike systems, the levels of the various columns would
all coincide (in our approximate treatment) and therefore are represented
in a single column, as in Fig. 8 or on the left of Fig. 45.

In the atoms and ions which are not hydrogenlike, the atoms
which jump from a state of principal quantum number n to one of
principal quantum number n' do not all emit the same spectral
line, but different lines according to the initial and final values of
the azimuthal quantum number l. In the case of hydrogenlike
systems these lines coincide, and the particular simplicity of these
spectra results from this fact. However, the coincidence is not
perfect, as we shall see later on.

[Fig. 45. Level diagram: Balmer terms (left) and Rydberg terms (right).]

60. Note on the relativistic theory of hydrogenlike systems. 

The motion of the emission electron takes place with a velocity of 
the order of $10^8$ cm/sec, that is, of the order of a few thousandths 
of the velocity of light, as may be easily calculated. Hence ordinary 
mechanics may be applied to the electron only approximately, and 
it is to be expected that a more rigorous treatment, using relativistic 
dynamics, will apply corrections to the previously found results 
which are not entirely negligible. 

If we repeat the calculations of §56 for hydrogenlike systems
but make use of relativistic mechanics instead of ordinary mechanics,^22
we find first of all that the electron describes an orbit in the
shape of a rosette; that is, it describes an ellipse with one focus at
the nucleus, according to Kepler's laws. This ellipse, however,
rather than being stationary, rotates slowly about the nucleus.
The effect of the relativistic correction on the orbit is therefore
analogous to the effect which, for nonhydrogenlike atoms, is due to
the fact that the field of force is not exactly Newtonian. Hence the
orbit is no longer a closed curve, nor the motion periodic. Consequently,
the relativistic perturbation partially removes the degeneracy
of the system, as has been pointed out in §56. As far as the
energy levels are concerned, we find the following approximate
expression:

$$E_{n,k} = -\frac{2\pi^2 Z^2 e^4 m}{h^2}\,\frac{1}{n^2} - \frac{\pi^4 Z^4 e^8 m}{c^2 h^4}\,\frac{1}{n^3}\left(\frac{8}{k} - \frac{6}{n}\right),$$

or else, if we recall the expression for the Rydberg constant, found
in §16 of Part I, and introduce the constant^23 (called the fine-structure
constant)

$$\alpha = \frac{2\pi e^2}{hc} \approx \frac{1}{137}, \qquad (340)$$

the last expression becomes

$$E_{n,k} = -\frac{RhcZ^2}{n^2}\left[1 + \frac{\alpha^2 Z^2}{n^2}\left(\frac{n}{k} - \frac{3}{4}\right)\right]. \qquad (341)$$

^22 See, for instance, No. 18 of the Bibliography.

Neglecting the term in $\alpha^2$, we have here the expression already
found in the nonrelativistic theory. This term represents a correction
whose ratio to the principal term is of the order of $\alpha^2$, or a
few hundred-thousandths. This correction, as we see, is dependent
not only on n but also on k. Hence all the orbits having the same
principal quantum number n and azimuthal quantum numbers
k = 1, 2, . . . n (that is, the ellipses with the same major axis and
different eccentricities) do not exactly correspond to the same
energy level, as was found in a first approximation, but to different
levels, which lie very close together. Therefore we speak of
multiplet levels. Of course to this multiplicity of levels there corresponds
a multiplicity of spectral lines. The atoms which jump
from a state of principal quantum number n, with various values
of k, to a state of principal quantum number n', with different
values of k', emit lines which do not coincide exactly. Therefore,
strictly speaking, the same effect occurs in hydrogenlike systems
as has been discussed in the previous section for more complex
atoms, although for a different reason. The frequency differences
in question are so small, however, that they may be detected only
with instruments of high resolving power and only if the lines are
very narrow. Therefore the group of very closely spaced lines
will be considered as a "line" composed of several "components"
(fine structure).

^23 This quantity, which occurs also in connection with the spinning electron,
has the dimensions of a pure number and is equal, as may be easily verified,
to the ratio of the electron velocity in the first circular orbit to the velocity of
light. The numerical value given here is quoted by R. T. Birge, Phys. Rev. 79,
193 (1950).

Formula (341) permits us to calculate the frequencies of the
various components of each line. Here we shall restrict ourselves
to noting that the frequency differences are of the order of $\alpha^2 Z^4 R$
and therefore must be 16 times as large in the spectrum of ionized
helium (Z = 2) as in the corresponding hydrogen lines. This
fact, together with the circumstance that it is easier to obtain
narrow lines with helium than with hydrogen, is the reason why
the fine structure is easier to observe in helium.
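
The factor of 16 can be checked numerically. The sketch below assumes that formula (341) has the usual Sommerfeld form, with term values $R Z^2/n^2\,[1 + (\alpha^2 Z^2/n^2)(n/k - 3/4)]$, and uses modern values of the constants; the function names are hypothetical.

```python
ALPHA = 1 / 137.036   # fine-structure constant (modern value, an assumption here)
R = 109737.3          # Rydberg constant in cm^-1 (infinite nuclear mass)


def sommerfeld_term(Z, n, k):
    """Term value -E/(hc) in cm^-1 from the approximate formula (341)."""
    if not (1 <= k <= n):
        raise ValueError("need 1 <= k <= n")
    return R * Z**2 / n**2 * (1 + ALPHA**2 * Z**2 / n**2 * (n / k - 0.75))


def splitting(Z, n):
    """Separation (cm^-1) between the k = 1 and k = n levels of given n."""
    return sommerfeld_term(Z, n, 1) - sommerfeld_term(Z, n, n)


dH = splitting(1, 2)     # hydrogen, n = 2
dHe = splitting(2, 2)    # ionized helium, n = 2
print(f"H: {dH:.3f} cm^-1, He+: {dHe:.3f} cm^-1, ratio: {dHe / dH:.1f}")
```

The correction term scales as $Z^4$, so the n = 2 splitting of ionized helium comes out 16 times that of hydrogen.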

The predictions which are made by formula (341) (supplemented
by a selection rule which will be stated in §64) agree almost exactly
with the results of observations carried out with helium and with
hydrogen. Nevertheless, this coincidence must be considered fortuitous
to a certain extent.

In fact, the preceding theory neglects two classes of facts. First,
the Bohr-Sommerfeld model represents only a first approximation;
secondly, it neglects the influence of the electron spin. In Chapter 14
it will be seen that when these two errors are eliminated,
by using the relativistic wave mechanics of Dirac (which implicitly
accounts for spin), the result obtained is fairly close to the one in
formula (341) (and exactly confirmed by experiment), so that the
partial success of (341) is due to the fact that the errors depending
upon the two above-mentioned causes almost exactly cancel each
other.

61. The Bohr magneton. An important consequence of the
Rutherford model of the atom is that an atom must in general
possess a magnetic moment because of the orbital motion of the
electrons, whose trajectories are equivalent to as many electric
circuits. On the basis of this model, let us calculate the magnetic
moment produced by the motion of the electron in the case of
hydrogenlike systems.

To make this calculation we note that if $\tau$ is designated as the
period of the Keplerian motion, the charge e passes $1/\tau$ times per
second through any point of the trajectory. This motion is as
if the ellipse were traversed by a current of intensity $e/\tau$. This
ellipse, according to a well-known principle of electromagnetic
theory, is equivalent to a magnetic shell whose strength (magnetic
moment per unit area) is $(1/c)(e/\tau)$. Hence, if the area of the
ellipse is called S, its magnetic moment is

$$\mu = \frac{1}{c}\,\frac{e}{\tau}\,S. \qquad (342)$$

Now, by the law of areas,

$$dS = \frac{1}{2}\,r^2\,d\omega = \frac{M}{2m}\,dt,$$

and hence, by integration over a whole period,

$$S = \frac{M}{2m}\,\tau. \qquad (343)$$

Substituting into (342), we see that the magnetic moment $\mu$
turns out to be related in magnitude to the angular momentum M by

$$\mu = \frac{e}{2mc}\,M. \qquad (344)$$


If we then consider that the vectors $\boldsymbol{\mu}$ and $\mathbf{M}$ are perpendicular
to the orbital plane and are directed in opposite directions (as may
be readily seen), we may also write the vector relation

$$\boldsymbol{\mu} = -\frac{e}{2mc}\,\mathbf{M}, \qquad (344')$$

where e stands for the electronic charge in absolute value. This
result may be extended to systems with as many electrons as
desired.

We now recall that M is always an integral multiple of $h/2\pi$,
according to (329) or preferably (329'), so that we may write

$$\mu = l\,\frac{eh}{4\pi mc},$$

and we find that the magnetic moment due to the orbital motion
of electrons is always an integral multiple of

$$\mu_0 = \frac{eh}{4\pi mc}. \qquad (345)$$

This elementary magnetic moment is called the Bohr magneton.
Its value, in electromagnetic c.g.s. units, is

$$\mu_0 = 0.92736 \times 10^{-20}.$$
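
The numerical value can be reproduced from formula (345). The sketch below uses modern Gaussian c.g.s. values of the constants, which are assumptions here and differ slightly from the 1950 values behind the figure quoted in the text; the function names are hypothetical.

```python
import math

# Gaussian c.g.s. constants, modern values (assumptions, not from the text).
E_ESU = 4.80320e-10      # elementary charge, esu
H = 6.62607e-27          # Planck constant, erg s
M_E = 9.10938e-28        # electron mass, g
C = 2.99792e10           # speed of light, cm/s


def bohr_magneton():
    """mu_0 = e h / (4 pi m c), formula (345), in erg/gauss."""
    return E_ESU * H / (4 * math.pi * M_E * C)


def orbital_moment(M):
    """Magnitude of the orbital moment, mu = e M / (2 m c), formula (344)."""
    return E_ESU * M / (2 * M_E * C)


mu0 = bohr_magneton()
print(f"mu_0 = {mu0:.4e} erg/gauss")
# One quantum unit of angular momentum, h / 2 pi, carries one Bohr magneton.
print(math.isclose(orbital_moment(H / (2 * math.pi)), mu0))
```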

Moreover, it is apparent that, because of the result found in
§56 (spatial quantization), which may be immediately extended to
nonhydrogenlike systems, an atom placed in a magnetic field orients
itself in such a way that the component of M along the direction
of the magnetic field is $m\,(h/2\pi)$, where m is an integer which we have
called the magnetic quantum number. From (344') we may then see
that the component of the magnetic moment in the direction of
the field will also be quantized; that is,

$$\mu_H = m\,\mu_0. \qquad (346)$$

Stern and Gerlach have given a remarkable experimental demonstration
of this fact, which has made it possible to measure the
magnetic moments of different atoms directly.

Incidentally, it is to be noted that the result (346) may also be
obtained by wave mechanics, as has been shown by Fermi. In
fact, we may use expression (137') from §31 for the average electric
current density j and substitute for u the expression found in §46
for the electron under the action of a central field of force, with the
result

$$u = \frac{1}{\sqrt{2\pi}}\,R(r)\,\Theta(\theta)\,e^{im\varphi},$$

where R and $\Theta$ are two functions whose exact form is not of interest;
it suffices to mention that they are real and normalized according to
formulas (244) and (252). We readily find that the components
of j along the radius vector and along the meridian are zero, whereas
the component along increasing $\varphi$ is

$$j_\varphi = \frac{eh}{4\pi i m}\left(\bar{u}\,\frac{\partial u}{\partial\varphi} - u\,\frac{\partial\bar{u}}{\partial\varphi}\right)\frac{1}{r\sin\theta} = \frac{eh}{4\pi^2 m}\,m\,\frac{R^2\Theta^2}{r\sin\theta}$$

(here the m in the denominators is the electron mass, while the
separate factor m is the magnetic quantum number).
The surface element $r\,dr\,d\theta$ of the meridian plane is therefore
traversed by a current of intensity $j_\varphi\,r\,dr\,d\theta$, which describes a
circle of radius $r\sin\theta$ and is thus equivalent to a magnetic shell
of magnetic moment

$$d\mu_H = \frac{\pi}{c}\,j_\varphi\,r^3\sin^2\theta\,dr\,d\theta.$$

Integrating over the entire half-plane of the meridian, we obtain
the total magnetic moment in the direction of the polar axis,
which is

$$\mu_H = \frac{eh}{4\pi mc}\,m\int R^2 r^2\,dr\int \Theta^2\sin\theta\,d\theta;$$

and since, by virtue of the quoted normalization conditions, the
two integrals are equal to 1, again we find, recalling (345), the
result of (346).

^24 See, for example, No. 27 of the Bibliography, page 168.

62. The spinning electron and the inner quantum number. The
considerations of the preceding section led for a long time to the
belief that all phenomena depending upon the magnetic moment
of the atoms (such as paramagnetism and ferromagnetism) could
be explained by the moment due to the orbital motions of the
electrons. But there are many phenomena (Zeeman effect, gyromagnetic
effects) which permit the determination of the ratio $\mu/M$
of the magnetic moment to the mechanical moment (that is, the
angular momentum) of the atom. According to the preceding
section, this ratio should prove to be constant and equal to e/2mc.
Instead it was found that this gyromagnetic ratio sometimes has
different values, but is always a simple multiple of e/2mc. The
most obvious explanation is that some part of the atom has a
mechanical and a magnetic moment which are of an origin different
from that of the orbital motion, and that the ratio of these moments
is different from the one mentioned above. It has been found that
these difficulties disappear if it is assumed that each electron
possesses an intrinsic angular momentum (spin) equal to a half
quantum unit, namely,

$$M_0 = \frac{1}{2}\,\frac{h}{2\pi}$$

(as if it rotates about itself like a top), and an intrinsic magnetic
moment pointing in the opposite direction and having the value of
one Bohr magneton, namely,

$$\mu_0 = \frac{eh}{4\pi mc}.$$

This is the hypothesis of the spinning electron, which we have
mentioned in §25 of Part I, where the reason for this improper
designation has been explained. It is apparent that the ratio
$\mu_0/M_0$ is twice the corresponding ratio relative to orbital motion.

Of course the orientation of the axis of the spinning electron
in a magnetic field cannot be deduced from the preceding hypothesis.
It requires a new hypothesis, which is suggested by the result
of §56 on the spatial quantization of the orbits, and represents the
natural generalization thereof. In order to state the spatial quantization
rule in its most general form, let us consider a system in
which the total angular momentum is represented by a vector
j. Using quantum units, as we shall always do in this section,
j will be an integer or a half-integer (that is, an integer plus 1/2).
We shall assume that this system, when placed in a magnetic field,
will orient itself in such a way that the projection of j along the direction
of the field will have one of the 2j + 1 discrete values lying
between -j and +j (extremes included) and spaced apart by unity;
that is,

$$-j,\ -(j-1),\ \ldots,\ (j-1),\ j. \qquad (347)$$

The extreme values evidently correspond, in the intuitive model,
to j antiparallel or parallel to the field. Naturally, if j is an integer,
the values of the series (347) will be integers. If j is a half-integer,
they will be half-integers.
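
Rule (347) is easy to enumerate. The following sketch uses exact fractions so that half-integers are represented without rounding; the function name is hypothetical.

```python
from fractions import Fraction


def projections(j):
    """Allowed projections of j on the field direction, in quantum units.

    j must be a non-negative integer or half-integer; the result lists the
    2j + 1 values -j, -(j - 1), ..., j - 1, j of rule (347).
    """
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative integer or half-integer")
    return [-j + step for step in range(int(2 * j) + 1)]


print(projections(2))                 # integer j gives integer projections
print(projections(Fraction(1, 2)))    # j = 1/2 allows only two orientations
```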

This rule, when applied to the orbital angular momentum of
the electron, brings us back to the result of §56. If it is applied
instead to the angular momentum due to its spin alone, in which
case j = 1/2, it evidently yields the following result: the projection
of the spin in the direction of the field can have only the two values
±1/2. In other words, the electron spin is always either parallel or
antiparallel to the field.

Chapter 15 explains how the values for $M_0$ and $\mu_0$ and their
projection on the field direction may be deduced from quantum
mechanics without the introduction of new hypotheses. For the
present, however, we shall assume these values as hypotheses justified
a posteriori by experience.

It is now apparent that the electron of hydrogenlike systems,
because of its orbital motion, finds itself in a magnetic field perpendicular
to the plane of the orbit. In order to understand this
concept, we should think of an observer who travels with the electron;
he would see the nucleus revolving about the electron and
hence (since an electric charge in motion is equivalent to a current)
would see himself encircled by an electric current, and therefore
located in a magnetic field perpendicular to the plane of that
current.^26 Because of the effect of this field, the spin will always
orient itself normally to the orbital plane, in one of two directions,
and hence the intrinsic angular momentum of the electron will be
added to (or subtracted from) the angular momentum of the orbital
motion, which is equal to l in our approximation. The resultant
angular momentum will therefore be

    j = l ± 1/2   if l ≠ 0 (p, d, f, . . . orbits),
    j = 1/2       if l = 0 (s orbit).

In order to distinguish the two states of the atom corresponding
to the two possible spin orientations, it suffices to add to the indication
of the three quantum numbers n, l, m, which characterize the
orbit, the indication of the resultant angular momentum, which is
denoted by j, expressed in quantum units, and is called the inner
quantum number. In the case we are considering, this number
may take on only the two (half-integral) values j = l ± 1/2 if
l ≠ 0, and j = 1/2 if l = 0. In the latter case, furthermore, we are
not to consider the two spin orientations as corresponding to two
distinct states of the atom, because, since the vector M (and consequently
also the magnetic field) is zero, the two orientations which
the spin may assume are not physically distinct.
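
As a small illustration of this rule for a single electron, the sketch below lists the possible values of j for the first few orbit types. The function name is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def inner_quantum_numbers(l):
    """Possible values of j for one electron with azimuthal quantum number l."""
    if l < 0:
        raise ValueError("l must be a non-negative integer")
    if l == 0:
        return [HALF]              # s orbit: a single level
    return [l - HALF, l + HALF]    # p, d, f, ... orbits: a doublet


for l, letter in enumerate("spdf"):
    values = ", ".join(str(j) for j in inner_quantum_numbers(l))
    print(f"{letter} orbit (l = {l}): j = {values}")
```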

^25 It will further be seen that the "vector model" of which we shall make
use (vector susceptible of discrete orientations) does not represent the properties
of spin at all well. Nevertheless, like all models, it is very useful in
aiding intuition and in simplifying the terminology.

^26 We can reach the same conclusion from the theory of relativity, in which
it will be remembered that if an electric field exists in a certain frame of reference
that is assumed to be fixed, then an electric and a magnetic field will exist
in another system that is in motion with respect to the first. In our case, the
electric field of the fixed system is the one produced by the nucleus.

All that has been said applies not only to hydrogenlike systems
but also to the emission electron of the alkali metals, because in
these atoms the magnetic fields produced by the Z - 1 electrons
of the atomic core cancel each other. However, in other atoms the
spins and magnetic fields produced by the electrons (or by some
of the electrons) of the core must also be considered, and hence the
possibilities are more numerous. For these arguments we refer to
a volume on spectroscopy (see footnote 18). We
confine ourselves here to stating that the compound angular momentum
of all the electrons of the atom (due either to their orbital
motion or to their spin) is a constant vector whose magnitude is
generally expressed (in quantum units) by an integer or half-integer
J, which is called the inner quantum number of the atom. This
number is identical with the inner quantum number j of the emission
electron in the case of the alkali metals and hydrogenlike systems.

Now turning to hydrogenlike systems and to alkali metals, we
note that to the two values of the inner quantum number j there
correspond two slightly different energy levels. In fact, by likening
the electron to a magnet of moment $\boldsymbol{\mu}_0$, we recognize that the latter,
finding itself in a field H due to the effect of its orbital motion,
possesses a potential energy $-\boldsymbol{\mu}_0 \cdot \mathbf{H}$. Hence to the two orientations
which the spin may assume there correspond the two values

$$\varepsilon_1 = \mu_0 \bar{H}, \qquad \varepsilon_2 = -\mu_0 \bar{H},$$

of the magnetic energy, which is to be added to the kinetic energy
and the potential energy of the atom. We have indicated by $\bar{H}$ the
average value of the field H along the orbit. However, a more
thorough examination, taking into account the theory of relativity,
shows that the preceding values must be cut in half.^27 By means
of a calculation not given here, we find, making use of the known
expressions for the Rydberg constant R and the fine-structure constant
$\alpha$ of formula (340),

$$\varepsilon_{1,2} = \pm\frac{1}{2}\,\frac{\alpha^2 Z^4}{n^3 l^2}\,Rhc. \qquad (348)$$

Every energy level is therefore split into two neighboring levels by
the effect of the spin. Their difference is of the order of $\alpha^2 Z^4 Rhc$,
like the one produced by the relativistic correction [see (341)]. We
shall not insist upon the quantitative aspects of this theory, however,
since quantum mechanics appreciably modifies formulas (348),
as we shall see.

^27 See L. H. Thomas, Phil. Mag. 3, 1 (1927); J. Frenkel, Zeits. f. Physik 37,
243 (1926).

63. The correspondence principle. We shall now make a comparison
between the spectrum which a system emits according to
the Bohr-Sommerfeld theory (quantum-mechanical spectrum) and
the spectrum (which we shall call classical) which it would emit if,
though able to exist only in quantum states, it were to radiate, in
each of these states, in accordance with the ordinary laws of electromagnetic
theory. First, then, we shall see what would be the
emitted frequencies in the latter case (which is, of course, entirely
fictitious).

We recall that according to the laws of electromagnetic theory,
the emission of radiation by a system of electric charges is determined
by the variation of its "electric moment," which is a function
of the generalized coordinates q, which in turn are periodic functions
of time, each with a frequency $\nu_i = 1/T_i$. It follows that each of
the components X, Y, Z of the electric moment may be developed
in a multiple Fourier series, of the form

$$X(t) = \sum_{\tau_1, \tau_2, \ldots, \tau_f = -\infty}^{\infty} A_{\tau_1\tau_2\cdots\tau_f}\, e^{2\pi i(\tau_1\nu_1 + \tau_2\nu_2 + \cdots + \tau_f\nu_f)t}, \qquad (349)$$

and similarly for the Y and Z components.

In fact, let us for an instant consider X to be a function of the coordinate
$q_1$ alone, and let us keep the other coordinates constant. X will then be a
periodic function of time, with frequency $\nu_1$, and may be developed in a
simple Fourier series,

$$X = \sum_{\tau_1 = -\infty}^{\infty} a_{\tau_1}\, e^{2\pi i\tau_1\nu_1 t},$$

where the coefficients $a_{\tau_1}$ are functions of $q_2, q_3, \ldots, q_f$. We may now
apply the same procedure to each of these coefficients, considering them
to be functions of $q_2$ only and leaving $q_3, \ldots, q_f$ constant. We then will
have

$$a_{\tau_1} = \sum_{\tau_2 = -\infty}^{\infty} b_{\tau_1\tau_2}\, e^{2\pi i\tau_2\nu_2 t},$$

and hence, substituting into the expression for X, the latter becomes the
double series

$$X = \sum_{\tau_1, \tau_2 = -\infty}^{\infty} b_{\tau_1\tau_2}\, e^{2\pi i(\tau_1\nu_1 + \tau_2\nu_2)t},$$

where the coefficients $b_{\tau_1\tau_2}$ depend on $q_3, \ldots, q_f$.

Applying the same process successively, we evidently obtain the
expansion (349).

Now if the components of the electric moment can be decomposed
into a sum of sinusoidal terms, each component of the electric
field and the magnetic field which it generates may also be decomposed
in the same way. Hence the radiation emitted by the system
in the state in question consists of the superposition of monochromatic
radiations, whose frequencies (which we shall indicate
by $\nu_{cl}$) are given by all the values taken on by the expression

$$\nu_{cl} = \tau_1\nu_1 + \tau_2\nu_2 + \cdots + \tau_f\nu_f, \qquad (350)$$

where we assign to $\tau_1, \tau_2, \ldots, \tau_f$ all integral values (positive or
negative) which do not make the right-hand side negative. The
intensity of each of these monochromatic components is proportional
to

$$|A_{\tau_1\tau_2\cdots\tau_f}|^2 + |B_{\tau_1\tau_2\cdots\tau_f}|^2 + |C_{\tau_1\tau_2\cdots\tau_f}|^2, \qquad (351)$$

where we have designated by $B_{\tau_1\tau_2\cdots\tau_f}$ and $C_{\tau_1\tau_2\cdots\tau_f}$ the coefficients
of the expansion of Y and Z analogous to (349). In particular, if
for some of the groups of values $\tau_1, \tau_2, \ldots, \tau_f$ all three coefficients
A, B, C vanish, the corresponding monochromatic component will
be missing from the emitted radiation.

The fictitious spectrum which we have agreed to call classical
is therefore composed of lines which are specified by two groups
of indices: $n_1, n_2, \ldots, n_f$ (which define the state of the system,
and hence the frequencies $\nu_1, \nu_2, \ldots, \nu_f$) and $\tau_1, \tau_2, \ldots, \tau_f$.
Therefore, to each line of the classical spectrum we may make
correspond a line of the quantum spectrum, namely, that line which
is emitted in the transition from the state with quantum numbers
$n_1, n_2, \ldots, n_f$ to the state with quantum numbers
$n_1 - \tau_1, n_2 - \tau_2, \ldots, n_f - \tau_f$. We shall now ascertain the frequency of this line
when it is calculated according to the quantum theory.

To do this, we first recall from rational mechanics (see §52) that
the energy of a multiply periodic system may be expressed in
the form

$$E(J_1, J_2, \ldots, J_f), \qquad (352)$$

that is, as a function of the phase integrals $J_i$ (which replace the f
constants of integration $\alpha_i$). It may then be shown that the partial
derivatives of this function are equal to the frequencies $\nu_i$ corresponding
to the several degrees of freedom:

$$\frac{\partial E}{\partial J_i} = \nu_i \qquad (i = 1, 2, \ldots, f). \qquad (353)$$

With this assumption, in the quantum jump mentioned above
the phase integrals will go from the values $J_i = n_i h$ to the values
$(n_i - \tau_i)h$. Hence the energy emitted is

$$\Delta E = E(J_1, J_2, \ldots, J_f) - E(J_1 - \tau_1 h,\ J_2 - \tau_2 h,\ \ldots,\ J_f - \tau_f h),$$

or, when the mean-value theorem is applied,

$$\Delta E = \overline{\frac{\partial E}{\partial J_1}}\,\tau_1 h + \overline{\frac{\partial E}{\partial J_2}}\,\tau_2 h + \cdots + \overline{\frac{\partial E}{\partial J_f}}\,\tau_f h,$$

where the bars indicate that the derivatives refer to a state
(not quantized) intermediate between the initial and final states.
Keeping (353) in mind, we may write, with similar meaning for
the bars,

$$\Delta E = \bar\nu_1\tau_1 h + \bar\nu_2\tau_2 h + \cdots + \bar\nu_f\tau_f h.$$

Hence the frequency emitted in the quantum jump in question is

$$\nu_{qu} = \bar\nu_1\tau_1 + \bar\nu_2\tau_2 + \cdots + \bar\nu_f\tau_f. \qquad (354)$$

A comparison of this formula with (350) shows that if the initial
and final states differ slightly (that is, if the numbers $\tau_i$ are small
compared with the $n_i$, implying that the $n_i$ must be large), the
frequencies $\bar\nu_i$ very closely approach the frequencies $\nu_i$ corresponding
to the initial state, and hence the frequency $\nu_{qu}$ of the line of the
quantum spectrum becomes approximately equal to the frequency $\nu_{cl}$
of the corresponding line in the classical spectrum.

Therefore we should say that in the limit of high quantum
numbers,* the "corresponding" lines (in the sense explained above) have
the same frequency. This theorem, which we have derived from
the fundamental postulates of the Bohr-Sommerfeld theory, proves
that this theory satisfies (as far as the prediction of spectral
frequencies is concerned) the essential condition required of any
atomic mechanics, namely, to have ordinary mechanics and
electromagnetic theory for a limit.
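
The limiting behaviour can be checked numerically for a single degree of freedom. The sketch below is only an illustration under stated assumptions: it uses units in which h = 1 and a model energy function E(J) = -1/J^2 chosen for convenience (it is not taken from the text), and the function names are hypothetical. It compares the quantum frequency of the jump n -> n - tau with tau times the classical frequency dE/dJ of the initial state.

```python
# Illustrative check of the frequency correspondence for one degree of freedom.
# Assumptions (not from the text): units with h = 1, model energy E(J) = -1/J**2.

def energy(J):
    """Model energy as a function of the phase integral J."""
    return -1.0 / J**2

def classical_frequency(J):
    """nu = dE/dJ, the analogue of (353) for the model energy."""
    return 2.0 / J**3

def quantum_frequency(n, tau):
    """Frequency emitted in the jump J = n h -> (n - tau) h, with h = 1."""
    return energy(n) - energy(n - tau)

def ratio(n, tau=1):
    """Quantum frequency divided by tau times the classical frequency."""
    return quantum_frequency(n, tau) / (tau * classical_frequency(n))

for n in (2, 10, 100, 1000):
    print(n, ratio(n))
```

For small n the two frequencies differ widely, while for large n the ratio tends to 1, which is the content of the theorem for this model.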

All that has now been said refers only to the frequencies of
spectral lines, not to their intensities or states of polarization.
These properties, though obtainable from the coefficients of the
Fourier development for the classical spectrum, are indeterminate
in the quantum theory of Sommerfeld. Bohr has corrected this
deficiency partially by the addition of a new postulate to the theory,
suggested by the consideration that the requirement mentioned
above must be fulfilled not only with regard to the frequency but
also to the other properties of the emitted radiation. This
postulate, known by the name of correspondence principle, is as follows:
The corresponding lines (in the sense explained above) have similar
intensities and similar states of polarization, and the similarity tends
toward identity with increasing quantum numbers. Hence,
calculating the intensities and conditions of polarization of the lines
of the classical spectrum by the methods of electromagnetic theory,
we obtain an approximate indication concerning the analogous
properties of the actually emitted lines. For this reason, the
intensity of the line of frequency (354) must, according to the
correspondence principle, be approximately proportional to
expression (351).

This principle, in spite of its merely qualitative character, has
furnished very remarkable results and has been of fundamental
importance in the development of the quantum theory, especially
since in certain cases (as we shall see in the following section) it has
given quantitative as well as qualitative results. We may add
that the present theory of radiation, based on quantum mechanics,
not only justifies the principle enunciated above but also replaces
it by an exact evaluation of the intensity and state of polarization
of the lines, which is found to be in agreement with experimental
results (see §32).

* It is apparent that the theorem holds to the same approximation also
when the line of the quantum spectrum, of frequency (354), is made to
correspond to the line of the classical spectrum having the same indices
$\tau_i$, but in which the $\nu_i$ refer to the final rather than to the
initial state (or to any intermediate state). Hence there is a certain
arbitrariness in the choice of the correspondence criterion.

64. The selection principle. One of the most important
applications of the correspondence principle obtains in the case in
which, in the classical spectrum, the intensities of the lines with
indices $\tau_i$ are zero, whether they would be emitted in the state with
quantum numbers $n_i$, in the state of quantum numbers $(n_i - \tau_i)$, or
in any intermediate state. Then the quantum line emitted in the
transition $n_i \to n_i - \tau_i$ always corresponds to a classical line of zero
intensity, no matter what the criterion adopted to define the
correspondence. Hence we may conclude that its intensity is zero, that
is, that this quantum jump cannot occur, or is forbidden. As may
be seen, this is one of the cases in which the correspondence principle
leads to a precise (not merely qualitative) statement, which is
found to be confirmed by experiment. In this particular case the
correspondence principle takes the name of selection principle.

We shall add that when in the classical spectrum, in the initial
state, in the final state or intermediate states, a single one (or two)
of the components $X$, $Y$, $Z$ of the electric moment corresponding
to a given line is zero, certain conclusions may be drawn concerning
the state of polarization of the quantum radiation, as will appear
from the following examples.

We shall now see some applications of the selection principle. 

(a) Harmonic oscillator. The electric moment in this case has
only an $x$-component, given (if the moving particle carries a
charge $e$) by

    $X = ex = eA \sin(2\pi\nu_0 t - \varphi) = \dfrac{eA}{2i}\, e^{-i\varphi} e^{2\pi i \nu_0 t} - \dfrac{eA}{2i}\, e^{i\varphi} e^{-2\pi i \nu_0 t}.$

The Fourier expansion therefore reduces to the two terms of
frequency $\nu_0$, and hence all the terms whose index $\tau$ is not equal to
±1 are missing. Hence only the transitions in which $n$ varies
by ±1 are possible, and all others are forbidden. We note that,
since $E_n = (n + \tfrac{1}{2})h\nu_0$, the frequency which is emitted (or absorbed)
in these transitions is $\nu_0$; that is, the quantum spectrum reduces to a
single line whose frequency is exactly equal to that of the classical
spectrum or to the proper frequency of the oscillator.

Since the $Y$- and $Z$-components of the electric moment are then
evidently always zero, the light emitted in the quantum jump will
always be polarized along the $x$-axis, that is, parallel to the direction
of oscillation.
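
The statement that only the Fourier terms with index ±1 are present can be verified numerically. The following sketch samples the motion over one period and computes the Fourier coefficients by a direct sum; the amplitude, phase, and number of samples are arbitrary illustrative values, and the helper name `fourier_coefficient` is hypothetical.

```python
import cmath
import math

def fourier_coefficient(samples, tau):
    """Coefficient of exp(2*pi*i*tau*nu0*t) for samples taken over one period."""
    N = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * tau * k / N)
               for k, x in enumerate(samples)) / N

# Illustrative values (assumptions, not from the text).
A, phi, N = 1.5, 0.3, 64

# x(t) = A sin(2*pi*nu0*t - phi), sampled at t_k = k / (N * nu0).
samples = [A * math.sin(2 * math.pi * k / N - phi) for k in range(N)]

coeffs = {tau: fourier_coefficient(samples, tau) for tau in range(-3, 4)}
for tau, c in coeffs.items():
    print(tau, round(abs(c), 12))
```

Only the entries for tau = +1 and tau = -1 have a modulus different from zero (equal to A/2), in agreement with the two exponential terms of the expansion.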

(b) Rigid rotator. Let us consider a rigid rotator containing
electric charges $e_i$ located at distances $r_i$ from the axis. If the
axis of rotation is taken along the $z$-axis, the components of the
electric moment are

    $X = \sum_i e_i r_i \cos(2\pi\nu_0 t - \varphi_i),$

    $Y = \sum_i e_i r_i \sin(2\pi\nu_0 t - \varphi_i),$

    $Z = 0,$

where $\nu_0$ is the frequency of the rotation and the $\varphi_i$ are constants.
The summations contain as many terms as there are charges. To
the first two components we may apply the consideration already
mentioned for the oscillator; we find that since only the frequency
$\nu_0$ is present in the Fourier development, only those quantum jumps
are possible for which $n$ changes by ±1. Keeping expression (315)
for the energy levels in mind, we see that the frequency emitted
in the transition $n \to n - 1$ is

(2n + l)^ (355) 

Therefore the spectrum emitted consists of an infinite number 
of equidistant lines (on the frequency scale). This result is of 
considerable importance for the theory of band spectra. 

In regard to the state of polarization: since the electric moment
rotates in the $xy$-plane, each line in the classical spectrum would
appear circularly polarized if the light is observed along the $z$-axis,
and plane-polarized (parallel to the $xy$-plane) if the light is observed
in a direction perpendicular to the $z$-axis, whereas elliptically
polarized light would be observed along intermediate directions.
Since this condition holds good for all the lines of the classical
spectrum, we may say as much for the actually emitted lines.

(c) Selection rule for the azimuthal quantum number. Let us
now apply the selection principle to the central motion of an
emission electron, assuming that the field generated by the core is
sufficiently close to the Newtonian type that we may decompose the
motion into a Keplerian motion plus a uniform precession, as was
done in §59. What we shall have to say applies particularly to
hydrogenlike systems in which the precession is due solely to a
slight relativistic correction.

Let us assume a system of Cartesian coordinates with the $x$- and
$y$-axes in the (fixed) plane of the orbit. Their relation to the polar
coordinates $r$, $\omega$ may be summarized in the formula

    $x + iy = r\, e^{i\omega}.$    (356)

Now we note that because of the precessional motion, $r$ and $\omega$ do
not have the same period. Indicating by $\nu_r$ and $\nu_\omega$ the respective
frequencies, and by $\varepsilon$ the frequency of the slow precession, we have

    $\nu_\omega = \nu_r + \varepsilon.$    (357)


We now introduce a system of axes $x'$, $y'$ which rotate in such
a way as to accompany the motion of precession. With respect
to these axes the motion will be periodic and hence the angle $\omega'$
between the radius vector and the axis $x'$ will be, like $r$, a periodic
function of frequency $\nu_r$, so that we may write the Fourier
development

    $r\, e^{i\omega'} = \sum_{\tau=-\infty}^{\infty} A_\tau\, e^{2\pi i \tau \nu_r t}.$    (358)

Since $\omega = \omega' + 2\pi\varepsilon t$, we obtain from (356) and (358),

    $x + iy = \sum_{\tau=-\infty}^{\infty} A_\tau\, e^{2\pi i (\tau \nu_r + \varepsilon) t},$

or else, solving for $\varepsilon$ in (357) and putting $\tau_r = \tau - 1$,

    $x + iy = \sum_{\tau_r=-\infty}^{\infty} A_{\tau_r + 1}\, e^{2\pi i (\tau_r \nu_r + \nu_\omega) t}.$    (359)

In order to obtain the expansions of $x$ and $y$ separately, it
suffices to write the complex conjugate expression of the last
equation and to perform an addition and subtraction. We evidently
find two series both containing terms of the type

    $e^{2\pi i (\tau_r \nu_r \pm \nu_\omega) t}.$

We have thus obtained the expansion of the coordinates $x$ and $y$
(and hence of the components of the electric moment) in the form
(349). We recognize that whereas $\tau_r$ may take on all integral
values, the coefficient of $\nu_\omega$ assumes only the values ±1. Then,
applying the selection principle and recalling that the quantum
number related to the coordinate $\omega$ is the azimuthal quantum
number $l$, we see that only those quantum jumps are possible in which the
azimuthal quantum number $l$ (and hence $k$) changes by ±1. We shall
express this by writing

    $\Delta l = \pm 1.$    (360)

This is the selection rule for the azimuthal quantum number,
which we have already found in §50 by wave mechanics and which
is of fundamental importance. In fact, with reference to the term
scheme represented in Fig. 45, this rule says that transitions are
possible only between adjacent columns of the scheme; this
restriction reduces the complication of the spectra enormously. In other
words, the s-terms may combine only with p-terms, the p-terms
only with s- or d-terms, and so on.
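
As a small illustration of rule (360), the sketch below lists, for each of the first few term types, the term types with which it may combine. The restriction to the letters s, p, d, f and the function names are choices made here for the example only.

```python
# Term letters for l = 0, 1, 2, 3 (illustrative cut-off at f-terms).
TERM_LETTERS = "spdf"

def allowed_l(l_initial, l_final):
    """Selection rule (360): the azimuthal quantum number changes by +1 or -1."""
    return abs(l_initial - l_final) == 1

def partners(l):
    """Letters of the terms that may combine with a term of azimuthal number l."""
    return [TERM_LETTERS[k] for k in range(len(TERM_LETTERS)) if allowed_l(l, k)]

for l, letter in enumerate(TERM_LETTERS):
    print(letter, "combines with", partners(l))
```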

On the other hand, there is no limitation on the changes of the
radial quantum number $n_r$, and hence none on the changes of the
principal quantum number $n$.

It is to be remembered, however, that the preceding reasoning 
holds under the hypothesis that the motion of the electron differs 
very little from Keplerian motion, and hence exceptions to the rule 
are to be expected in the case of terms which are strikingly different 
from the Balmer terms, and when the atom is subject to strong 
perturbations (electric or magnetic fields). Actually, in these cases 
forbidden lines are often observed in the spectrum, but they are of 
weak intensity, as is to be expected (see footnote 8 on page 226). 

(d) Selection rules for the magnetic quantum number and for the
inner quantum number. An argument similar to the preceding may
be made for the magnetic quantum number. For this number to
intervene, however, it is necessary to suppose that the atom is
situated in a magnetic field, no matter how weak, so that a privileged
direction in space is established (see §53), for which we take the
$z$-axis. The magnetic field produces a slow precession about this
axis (Larmor theorem). It may then be shown, in a manner
similar to (c) above, that for the magnetic quantum number $m$ the
selection rule

    $\Delta m = 0, \pm 1$

holds, which we have already proved in §50 by means of wave
mechanics. In addition, we find again the rules for polarization
stated in §50.

Finally, for the inner quantum number $j$ we find, by similar
considerations, the same selection rule

    $\Delta j = 0, \pm 1,$

with the additional rule, however, that the transition from $j = 0$ to
$j = 0$ is forbidden.
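
The two rules just stated can be written as simple predicates. The sketch below is illustrative only; the function names are hypothetical, and quantum numbers are passed as ordinary numbers (half-integral values such as 0.5 or 1.5 are accepted).

```python
def allowed_m(m_initial, m_final):
    """Rule for the magnetic quantum number: m changes by 0, +1 or -1."""
    return (m_final - m_initial) in (-1, 0, 1)

def allowed_j(j_initial, j_final):
    """Rule for the inner quantum number: j changes by 0, +1 or -1,
    with the transition j = 0 -> j = 0 excluded."""
    if j_initial == 0 and j_final == 0:
        return False
    return (j_final - j_initial) in (-1, 0, 1)

print(allowed_m(1, 0), allowed_m(2, 0))
print(allowed_j(0, 0), allowed_j(1, 1), allowed_j(0.5, 1.5), allowed_j(0.5, 2.5))
```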




GENERAL METHODS OF 
QUANTUM MECHANICS 




CHAPTER 10 
Mathematical Introduction 

1. Function space. It is often convenient to designate an
ensemble of $N$ numbers $f_1, f_2, \ldots, f_N$ by a point $P$ in an
$N$-dimensional space (referred to Cartesian axes numbered from 1 to $N$),
or else by a vector $\mathbf{f} = OP$ in the same space, calling $O$ the origin of
the axes.¹ It is then possible, by an obvious generalization of
elementary concepts, to define the magnitude or modulus $|\mathbf{f}|$ of the
vector $\mathbf{f}$ by means of the formula

    $|\mathbf{f}|^2 = f_1^2 + f_2^2 + \cdots + f_N^2;$

the scalar product of two vectors $\mathbf{f}$, $\mathbf{g}$ may be defined by:

    $\mathbf{f} \cdot \mathbf{g} = f_1 g_1 + f_2 g_2 + \cdots + f_N g_N.$

Also, such operations as sum and difference may be defined in
an obvious manner.

If we wish to extend these considerations to the case where
$f_1, f_2, \ldots, f_N$ are complex numbers, it is advantageous to replace
the preceding formulas by the following, which, in the case where
the numbers are real, reduce to the original formulas (as usual, the
asterisk denotes the complex conjugate):

    $|\mathbf{f}| = \sqrt{f_1 f_1^* + f_2 f_2^* + \cdots + f_N f_N^*},$    (1)

    $\mathbf{f} \cdot \mathbf{g} = f_1 g_1^* + f_2 g_2^* + \cdots + f_N g_N^*.$    (2)

Hence we can say that to specify a vector in the $N$-dimensional
space means to make a (real or complex) number $f_r$, which is the
$r$th component of the vector, correspond to every integer $r$ (from 1
to $N$).
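
Formulas (1) and (2) translate directly into a few lines of code. The sketch below is a minimal illustration with two arbitrary complex vectors; the function names are chosen here and are not part of the text. It also shows that exchanging the two factors of the scalar product yields the complex conjugate.

```python
import math

def scalar_product(f, g):
    """Formula (2): sum of f_r times the complex conjugate of g_r."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

def modulus(f):
    """Formula (1): square root of the sum of f_r times its conjugate."""
    return math.sqrt(scalar_product(f, f).real)

# Two arbitrary vectors with N = 2 complex components (illustrative values).
f = [1 + 2j, 3j]
g = [2 - 1j, 1 + 1j]

print(scalar_product(f, g))   # f . g
print(scalar_product(g, f))   # g . f, the complex conjugate of f . g
print(modulus(f))
```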

¹ In this whole chapter we shall deal only with vectors starting at the origin.
Therefore to every point there corresponds a vector, and vice versa.

We now wish to extend these considerations by introducing a
space with an infinite number of dimensions. In order to do this,
we consider, instead of the $N$ values of the index $r$ from 1 to $N$, the
infinite number of values which a real variable $x$ may take on in
an interval $(a, b)$. We can say that instead of $N$ axes, we are
considering a (continuous) infinity of coordinate axes, each
corresponding to a value of $x$. To assign a vector $\mathbf{f}$ to this space then
means to make a (real or complex) number correspond to every
value of $x$ (between $a$ and $b$); that is, it amounts to specifying a
function $f(x)$ which is single-valued over $(a, b)$. This operation
suggests that we consider any function $f(x)$ which is finite and
single-valued in an interval $(a, b)$ (possibly infinite) as a vector $\mathbf{f}$ in
a space with an infinity of dimensions, in which any one of the
values of $x$ between $a$ and $b$ characterizes a coordinate axis. The
corresponding value taken by the function represents the projection
of the vector upon that axis ($x$th component of the vector).

This space is therefore called function space. We can also say 
that the function f{x) is represented by a point in function space; 
and vice versa, every point of function space represents a function 
f{x). But often the vectorial interpretation is more useful. 

What we have said for a function of one variable $x$ may be
extended without difficulty to a function of $p$ variables
$f(x_1, x_2, \ldots, x_p)$, finite and single-valued in a certain domain $S$
(possibly infinite). The function space will in this case have
$\infty^p$ dimensions, and every one of its axes will be designated by a
group of $p$ numbers. The corresponding component of the vector
$\mathbf{f}$ will be the value which the function $f$ assumes corresponding to
these values of the independent variables. In all that follows we
shall refer to a function of $p$ variables for greater generality, but we
shall often indicate the totality of variables by the single letter $x$
and shall write $f(x)$ instead of $f(x_1, x_2, \ldots, x_p)$.

To the vectors of function space we may immediately apply the
usual definitions of sum and difference, as well as that of product
of a vector by a scalar.² For example, the sum of the two vectors
$\mathbf{f}$ and $\mathbf{g}$ is the vector whose $x$th component (or, in the case of several
variables, the vector whose component along the $x_1, x_2, \ldots, x_p$-axis)
is $f(x) + g(x)$, that is, the vector representing the function
$f(x) + g(x)$. The product of the vector $\mathbf{f}$ by the constant $c$ is the
vector $c\mathbf{f}$ representing the function $cf(x)$.

² In these considerations, by a scalar is meant a constant quantity (with
respect to $x_1, x_2, \ldots, x_p$).

If then a certain number $n$ of vectors $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ is given in
function space, all the vectors obtainable from the latter by means
of linear combinations with arbitrary (constant) coefficients

    $c_1 \mathbf{f}_1 + c_2 \mathbf{f}_2 + \cdots + c_n \mathbf{f}_n,$

which represent all the functions expressible as linear combinations
of $f_1(x), f_2(x), \ldots, f_n(x)$, are said to form a linear subspace (variety,
extent) in $n$ dimensions. This is the natural generalization of the
concept of plane or of straight line (passing through the origin) in
ordinary three-dimensional space, which may be thought of as being
specified by means of two vectors or of one vector, respectively,
originating from the origin. Of course the same linear subspace
may be specified in an infinite number of ways, by selecting for its
definition another group of $n$ vectors belonging to it.

There remain to be extended the formulas concerning the magni¬ 
tude and the scalar product, for which we have to restrict our 
considerations to a more limited class of functions, as will be shown 
in the following section. 

2. Hilbert space. In order to extend expressions (1), (2), and
analogous formulas, to the case with an infinite number of
dimensions, we must evidently replace the summations from 1 to $N$ by
integrals with respect to $x$ from $a$ to $b$ in the case of a single variable,
and by multiple integrals extended over the whole domain $S$ in
the general case of $p$ variables. Hence we shall be led to define as
modulus of the vector $\mathbf{f}$ representing the function $f$ (or, as is
sometimes said, as norm of the function $f$) the positive number $|\mathbf{f}|$ whose
square is given by the following formula, analogous to (1):

    $|\mathbf{f}|^2 = \int f(x)\, f^*(x)\, dS.$    (3)

Here and henceforth, the generally multiple integral extended over
the whole domain $S$ is indicated by a simple integral sign, and the
volume element of the domain by $dS = dx_1\, dx_2 \ldots dx_p$. (In the
case of a single variable, $dS = dx$, and the integral is simple.) It is
apparent that the integral (3) is not always convergent. Hence
we are led to consider from now on only those functions $f$ for which
the integral (3) is convergent (functions of integrable square), that is,
only the points of function space for which the distance from
the origin has a finite and definite value. The ensemble of these
points, called Hilbert space, constitutes part of function space. We
shall then call normalized functions³ those whose norm is 1 or those
represented by a unit vector (or versor); any function of integrable
square may be normalized by dividing it by its norm.

Let us now extend formula (2) to Hilbert space. The scalar
product of two vectors $\mathbf{f}$, $\mathbf{g}$ representing the functions $f(x)$, $g(x)$, or
scalar product of the two functions, is the quantity

    $\mathbf{f} \cdot \mathbf{g} = \int f(x)\, g^*(x)\, dS.$    (4)

(It can be shown that the integral is always convergent because of
the convergence of the integrals which define $|\mathbf{f}|$ and $|\mathbf{g}|$.) It is
apparent that the scalar product in general is not commutative
(as it is for real vectors). Reversing the order of its factors changes
it into its complex conjugate.

It is also to be noted that if $c$ is a constant,

    $(c\mathbf{f}) \cdot \mathbf{g} = c\, \mathbf{f} \cdot \mathbf{g},$    (5)

whereas

    $\mathbf{f} \cdot (c\mathbf{g}) = c^*\, \mathbf{f} \cdot \mathbf{g}.$    (5')

Definition (3) of the modulus of a vector $\mathbf{f}$ or norm of a function $f$
may also be stated as the square root of $\mathbf{f} \cdot \mathbf{f}$.

The projection of the vector $\mathbf{f}$ upon the vector $\mathbf{g}$ is the quantity

    $f_g = \mathbf{f} \cdot \operatorname{vers} \mathbf{g} = \mathbf{f} \cdot \dfrac{\mathbf{g}}{|\mathbf{g}|}.$    (6)

The condition of orthogonality of two vectors $\mathbf{f}$, $\mathbf{g}$ [or of the
functions $f(x)$, $g(x)$] is that $\mathbf{f} \cdot \mathbf{g} = 0$, that is, because of
definition (4),

    $\int f(x)\, g^*(x)\, dS = 0.$

This definition of orthogonality between functions has already
been introduced in §5 of Part II. Now we see the reason for this
designation. (Note that the condition is independent of the order
of the two vectors.)
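
Definitions (3), (4), and (6) can be tried out numerically for functions of a single variable. The sketch below approximates the integrals by the midpoint rule on the interval (0, 1); the choice of interval, of the two test functions, and of the helper names is made here for illustration and is not taken from the text.

```python
import math

def integrate(func, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h

def scalar_product(f, g, a, b):
    """Formula (4): integral of f(x) times the complex conjugate of g(x)."""
    return integrate(lambda x: f(x) * complex(g(x)).conjugate(), a, b)

def norm(f, a, b):
    """Square root of f . f, cf. formula (3)."""
    return math.sqrt(scalar_product(f, f, a, b).real)

def projection(f, g, a, b):
    """Formula (6): f . g divided by the norm of g."""
    return scalar_product(f, g, a, b) / norm(g, a, b)

# Two illustrative functions on (0, 1).
f = lambda x: math.sin(math.pi * x)
g = lambda x: math.sin(2 * math.pi * x)

print(abs(scalar_product(f, g, 0.0, 1.0)))   # close to 0: f and g are orthogonal
print(norm(f, 0.0, 1.0))                     # close to sqrt(1/2)
```

Dividing f by its norm, that is, using sqrt(2) sin(pi x), gives a normalized function in the sense defined above.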

³ It is to be noted that this definition coincides with that already given for
normalized eigenfunctions in §4 of Part II.

The expansion of a function in a series of orthogonal functions
(see §9 of Part II) has an important interpretation in Hilbert space.
First let us consider the case of a single variable, for simplicity.
We observe that any of the orthonormal eigenfunctions $y_n(x)$
(deriving from a differential equation of the type already
considered in §3 of Part II, whose eigenvalues we assume to be discrete)
is represented by a versor $\mathbf{y}_n$, and that this infinite number of versors
are orthogonal to each other. We may therefore say that these
versors define, in Hilbert space, a system of an infinity of orthogonal
coordinate axes (one for each value of $n$), in the same way in which a
triplet of versors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ in ordinary space defines⁴ a system of
Cartesian axes. Now, with the notation (4), relation (32) of Part II may
be written

    $f_n = \mathbf{f} \cdot \mathbf{y}_n$    (8)

and is interpreted as follows: the coefficient $f_n$ of the expansion of
the function $f$ in terms of the orthogonal functions $y_n$ is the
projection of the vector $\mathbf{f}$ upon the versor $\mathbf{y}_n$, or the component of $\mathbf{f}$
along the $n$th coordinate axis. Then the expansion [(31), Part II],
which may be written in vector notation

    $\mathbf{f} = \sum_{n=1}^{\infty} f_n \mathbf{y}_n,$    (8')

acquires the same meaning as the ordinary-space relation,

    $\mathbf{V} = V_x \mathbf{i} + V_y \mathbf{j} + V_z \mathbf{k},$

between a vector $\mathbf{V}$ and its components $V_x$, $V_y$, $V_z$.

⁴ The origin is understood to be fixed once and for all.

In Hilbert space, then, we are led to consider, besides the
original system of axes corresponding to the infinity of values of $x$
(which we shall call "continuous axes," the values of $x$ constituting
a continuous infinity), another system of axes (infinite in number,
but discrete), defined by the versors $\mathbf{y}_n$ ($n = 1, 2, \ldots$) and hence
depending on the differential equation of which the $y_n$ are
eigenfunctions. Every quadratically integrable function $f(x)$ is
specified⁵ by assigning values to the coefficients $c_n$ of its development in a
series of eigenfunctions $y_n$. Consequently, any vector (or any
point) of Hilbert space may be specified by means of its components
(or its coordinates) relative to the axes defined by the versors $\mathbf{y}_n$,
rather than by those components relative to the continuous axes
used in the beginning. Thus Hilbert space possesses a denumerable
infinity of dimensions, not a continuous infinity like the function
space of which it is a part.

For the axes $\mathbf{y}_n$, the magnitude $|\mathbf{f}|$ of the vector $\mathbf{f}$ may be
calculated by means of the formula

    $|\mathbf{f}|^2 = \sum_{n=1}^{\infty} f_n f_n^*,$    (9)

which, by Parseval's formula (see page 99), is equivalent to (3).
Similarly, it may easily be shown that the scalar product $\mathbf{f} \cdot \mathbf{g}$ may
be calculated by means of the components $f_n$ of $\mathbf{f}$, and $g_n$ of $\mathbf{g}$, by
the formula

    $\mathbf{f} \cdot \mathbf{g} = \sum_{n=1}^{\infty} f_n g_n^*,$    (10)

and hence the orthogonality condition may be written

    $\sum_{n=1}^{\infty} f_n g_n^* = 0.$    (11)
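
The equivalence of (9) and (3) can be illustrated numerically. The sketch below assumes, purely for the sake of the example, the orthonormal functions y_n(x) = sqrt(2) sin(n pi x) on the interval (0, 1) and the test function f(x) = x(1 - x); neither choice comes from the text. The components f_n are computed from (8) by midpoint-rule integration, and the truncated sum of their squares is compared with the integral of the square of f.

```python
import math

def integrate(func, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h

def y(n):
    """Assumed orthonormal system on (0, 1): y_n(x) = sqrt(2) sin(n pi x)."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)

def f(x):
    """Illustrative real function of integrable square on (0, 1)."""
    return x * (1.0 - x)

def component(n):
    """f_n = f . y_n, formula (8); all quantities are real here."""
    yn = y(n)
    return integrate(lambda x: f(x) * yn(x), 0.0, 1.0)

norm_sq_integral = integrate(lambda x: f(x) ** 2, 0.0, 1.0)       # formula (3)
norm_sq_series = sum(component(n) ** 2 for n in range(1, 40))     # formula (9), truncated

print(norm_sq_integral, norm_sq_series)
```

Both numbers agree closely with the exact value 1/30; the series is truncated at 39 terms, which suffices here because the components decrease rapidly with n.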

⁵ Indeed, two functions whose values are equal everywhere, except for some
points $x$ constituting an aggregate of zero measure, evidently have the same
coefficients $c_n$. Hence it is useful to consider two such functions as represented
by the same vector (or point) of Hilbert space. Moreover, $f(x)$ would be
specified (in the sense explained above) by the $c_n$ even if the series
$\sum_n c_n y_n$ were not convergent absolutely, but only in the mean, in the
sense of footnote 5 on page 99.

Let us now proceed to a consideration of the case of $p$ variables.
What has been said above must now be modified in the sense that
every eigenfunction is no longer specified by a single index $n$ but
rather by $p$ indices, so that we must write $y_{n_1 n_2 \ldots n_p}$. In this way
a system of orthogonal coordinate axes in Hilbert space is
defined, each of which is specified by means of a group of $p$ integers.
All the preceding formulas will then have to be modified by inserting
in the place of $n$ the group $n_1 n_2 \ldots n_p$ and by replacing the
simple summations by multiple summations. For example,
expression (9) becomes

    $|\mathbf{f}|^2 = \sum_{n_1 n_2 \ldots n_p = 1}^{\infty} f_{n_1 n_2 \ldots n_p}\, f^*_{n_1 n_2 \ldots n_p}.$    (9')

The formulas thus become somewhat more complicated formally,
but no new difficulties are introduced. Therefore we shall continue
to write the formulas for the case $p = 1$, with the understanding
that in order to pass to the general case it will suffice to replace
every index by a group of $p$ indices, and each simple summation by
a multiple one.

3. Linear operators. We shall call operator any symbol which,
when written in front of a function (of one or more variables),⁶ will
change it into another function of the same variables, according to
a definite law. Elementary mathematics and the calculus furnish
some examples, among which are the following:

(a) Any number $k$ may be regarded as an operator, since when
it is put before a function $f(x, y, \ldots)$, it changes the latter into the
product $kf(x, y, \ldots)$. Of course this convention holds also when $k$ is
itself a function. In particular, 1 is an operator which transforms any
function into itself; it is called identity operator.

(b) The symbols log, sin, cos, and so on, are all operators which
change the function $f(x_1, x_2, \ldots)$ into the function $\log f(x_1, x_2, \ldots)$,
$\sin f(x_1, x_2, \ldots)$, and so on.

(c) The symbol $d/dx$ is an operator which changes any
differentiable function $f(x)$ into its derivative. Similarly (for functions
of several variables) the symbols $\partial/\partial x_1$, $\partial/\partial x_2$, and so on, are
operators, and also the symbols of second and third derivatives, and
so on.

(d) The symbol $\int_{x_0}^{x_1} dx_1$ (with $x_0$ constant) is an operator which
changes any integrable function $f(x_1, x_2, \ldots, x_p)$ into the function
$\int_{x_0}^{x_1} f(x_1, x_2, \ldots, x_p)\, dx_1$.

⁶ Sometimes an operator is defined only for certain definite classes of
functions and has no meaning for others. For example, the operator $d/dx$ has a
meaning only for differentiable functions.

A general operator will be designated by a letter, usually a
German letter. For instance, we shall write $F = \mathfrak{A}f$ to indicate
that the operator $\mathfrak{A}$ applied to the function $f$ changes it into the
function $F$.

Using geometrical terminology, we may say that an
operator defines a correspondence between points (or between vectors)
of function space. Therefore we sometimes also write $\mathbf{F} = \mathfrak{A}\mathbf{f}$,
for example. We shall always suppose such a one-to-one
correspondence.

In what follows we shall deal only with linear operators, that is,
those which have the two properties:

    $\mathfrak{A}(f + g) = \mathfrak{A}f + \mathfrak{A}g,$    (α)

where $f$ and $g$ are any two functions;⁷ and

    $\mathfrak{A}(cf) = c\,\mathfrak{A}f,$    (β)

where $c$ is a constant and $f$ is any function. For example, among the
operators listed above, the operators $k$, $d/dx$, $\partial/\partial x_1$, $\partial/\partial x_2$,
$\int_{x_0}^{x_1} dx_1$ are linear, and the operators log, sin, cos, and so on, are
nonlinear.
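
Properties (α) and (β) lend themselves to a numerical spot check. In the sketch below an operator is modelled as a Python function that takes a function and returns a function; the derivative is approximated by a central difference, and the test functions, the constant, and the sample points are arbitrary illustrative choices. Passing such a check at a few points does not prove linearity, but failing it shows that an operator is nonlinear.

```python
import math

def derivative(f, h=1e-5):
    """Operator d/dx, approximated by a central difference (an approximation)."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)

def apply_sin(f):
    """Operator sin: changes f(x) into sin f(x)."""
    return lambda x: math.sin(f(x))

def is_linear(op, f, g, c, points, tol=1e-6):
    """Spot check of properties (alpha) and (beta) at the given points."""
    total = lambda x: f(x) + g(x)
    scaled = lambda x: c * f(x)
    for x in points:
        if abs(op(total)(x) - (op(f)(x) + op(g)(x))) > tol:
            return False
        if abs(op(scaled)(x) - c * op(f)(x)) > tol:
            return False
    return True

f = lambda x: x * x
g = math.exp
points = [0.1, 0.5, 1.0]

print(is_linear(derivative, f, g, 3.0, points))   # True
print(is_linear(apply_sin, f, g, 3.0, points))    # False
```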

In function space a linear operator establishes a correspondence
between vectors, which is the natural generalization of a vector
mapping of ordinary three-dimensional space.

The class of linear operators has properties of great
mathematical interest (see No. 33 of the Bibliography). We shall confine
ourselves here to the essential points of the theory of these operators.

4. Linear operator algebra. We can define operations of
combination among linear operators, analogous to the operations sum,
difference, and so on, by which algebraic quantities may be
combined with each other. Such definition permits the construction
of a linear operator algebra analogous (although not identical) to
ordinary algebra.

Given two linear operators $\mathfrak{A}$ and $\mathfrak{B}$, we call their sum, and
indicate by $\mathfrak{A} + \mathfrak{B}$, the (linear) operator defined by⁸

    $(\mathfrak{A} + \mathfrak{B})f = \mathfrak{A}f + \mathfrak{B}f.$

⁷ Of course, provided that they are such that it is meaningful to apply the
operator $\mathfrak{A}$ to them. This condition will always be implied in what follows.

⁸ Here and in what follows, $f$ is any function to which the operators in
question may be applied.

In an analogous manner we define the difference of two linear
operators, and the sum of any number of them.

For instance, take the linear operator $\Delta$, called the Laplacian, which is
used a great deal in mathematical physics. It is defined by

    $\Delta f(x, y, z) = \dfrac{\partial^2 f}{\partial x^2} + \dfrac{\partial^2 f}{\partial y^2} + \dfrac{\partial^2 f}{\partial z^2},$

and is the sum of the three linear operators $\partial^2/\partial x^2$, $\partial^2/\partial y^2$,
$\partial^2/\partial z^2$; that is, we may write

    $\Delta = \dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2} + \dfrac{\partial^2}{\partial z^2}.$    (12)


Another well-known example is the operator which occurs in the
left-hand member of the Schrödinger equation (131) of Part II, and may be
written


STT^m 


Vt = 0. 


(13) 


The operation of first applying the operator $\mathfrak{B}$ and then
applying, upon the function obtained, the operation $\mathfrak{A}$, is called the
product of the linear operator $\mathfrak{A}$ by the linear operator $\mathfrak{B}$, and is
indicated by $\mathfrak{A}\mathfrak{B}$. This amounts to saying that

    $\mathfrak{A}\mathfrak{B}f = \mathfrak{A}(\mathfrak{B}f).$

Evidently the product of two linear operators is itself linear.

In general, the commutative law does not hold for such a product;
that is, the operator $\mathfrak{A}\mathfrak{B}$ does not coincide with the operator $\mathfrak{B}\mathfrak{A}$.
It is just this property which makes operator algebra different from
ordinary algebra. When the two linear operators $\mathfrak{A}\mathfrak{B}$ and $\mathfrak{B}\mathfrak{A}$ are
identical, $\mathfrak{A}$ and $\mathfrak{B}$ are said to commute.

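Examples. Two numerical factors $k_1$, $k_2$ (either constants or not) are
always commuting operators, since $k_1 k_2 f = k_2 k_1 f$. Similarly, the linear
operators $\partial/\partial x$ and $\partial/\partial y$ commute as a rule, since their product in
the order indicated is $\partial^2/\partial x\, \partial y$ and in reversed order is
$\partial^2/\partial y\, \partial x$. However, the two linear operators

    $\mathfrak{A} = x$  and  $\mathfrak{B} = \dfrac{d}{dx}$    (14)

do not commute, because

    $\mathfrak{A}\mathfrak{B}f = x\,\dfrac{df}{dx}, \qquad \mathfrak{B}\mathfrak{A}f = \dfrac{d}{dx}(xf) = f + x\,\dfrac{df}{dx}.$

The last equality may also be written

    $\mathfrak{B}\mathfrak{A}f = f + \mathfrak{A}\mathfrak{B}f = (1 + \mathfrak{A}\mathfrak{B})f,$

which means that for the operators $\mathfrak{A}$ and $\mathfrak{B}$ defined by (14), the
commutation relation

    $\mathfrak{B}\mathfrak{A} - \mathfrak{A}\mathfrak{B} = 1$    (15)

holds, instead of the commutative property.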
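
Relation (15) can be checked exactly on polynomials. In the sketch below a polynomial is represented by its list of coefficients [c0, c1, c2, ...]; this representation, the sample polynomial, and the helper names are choices made here for illustration.

```python
def multiply_by_x(p):
    """Operator A = x acting on the polynomial with coefficients p."""
    return [0] + list(p)

def differentiate(p):
    """Operator B = d/dx acting on the polynomial with coefficients p."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def subtract(p, q):
    """Coefficient-wise difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def trim(p):
    """Remove trailing zero coefficients (keeping at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

f = [5, -3, 0, 2]                         # f(x) = 5 - 3x + 2x^3 (illustrative)
BA_f = differentiate(multiply_by_x(f))    # B A f = d/dx (x f)
AB_f = multiply_by_x(differentiate(f))    # A B f = x df/dx
commutator_f = trim(subtract(BA_f, AB_f))

print(commutator_f)   # the coefficients of f itself
```

The difference of the two products reproduces f, that is, the operator BA - AB acts as the identity, as stated in (15).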

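Obviously, the $n$th power (with $n$ a positive integer) of a linear
operator $\mathfrak{A}$ is defined as the product of $n$ factors equal to $\mathfrak{A}$. We
also adopt the convention $\mathfrak{A}^0 = 1$.

Example. The $n$th power of the operator $\dfrac{\partial}{\partial x}$ is $\dfrac{\partial^n}{\partial x^n}$; that is,

    $\left(\dfrac{\partial}{\partial x}\right)^n = \dfrac{\partial^n}{\partial x^n}.$

Hence (12) may also be written

    $\Delta = \left(\dfrac{\partial}{\partial x}\right)^2 + \left(\dfrac{\partial}{\partial y}\right)^2 + \left(\dfrac{\partial}{\partial z}\right)^2.$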

Given a linear operator $\mathfrak{A}$, if there exists a linear operator $\mathfrak{B}$ such
that

    $\mathfrak{A}\mathfrak{B} = \mathfrak{B}\mathfrak{A} = 1,$

we say that $\mathfrak{B}$ is the reciprocal or the inverse of $\mathfrak{A}$, and vice versa,
and we write

    $\mathfrak{B} = \mathfrak{A}^{-1}, \qquad \mathfrak{A} = \mathfrak{B}^{-1}.$

If we apply to a function an operator and its inverse in
succession, the two operations will cancel, and again we have the
original function.

If the inverse of $\mathfrak{A}$ exists, we may define the powers of $\mathfrak{A}$ with
negative exponent, putting

    $\mathfrak{A}^{-n} = (\mathfrak{A}^{-1})^n.$    (16)

It is clear that for operators the ordinary theorems on powers
hold; for example, $\mathfrak{A}^n \mathfrak{A}^m = \mathfrak{A}^{n+m}$ ($n$, $m$ are positive, zero, or
negative), and so on.

It is apparent that if both $\mathfrak{A}$ and $\mathfrak{B}$ possess an inverse, their
product $\mathfrak{A}\mathfrak{B}$ will also have one, and it will be $\mathfrak{B}^{-1}\mathfrak{A}^{-1}$ (note the
inversion of factors). In fact,

    $(\mathfrak{A}\mathfrak{B})(\mathfrak{B}^{-1}\mathfrak{A}^{-1}) = \mathfrak{A}(\mathfrak{B}\mathfrak{B}^{-1})\mathfrak{A}^{-1} = \mathfrak{A}\mathfrak{A}^{-1} = 1.$

The powers of a linear operator $\mathfrak{A}$ having been defined, we may
define other linear operators called functions of $\mathfrak{A}$, in the following
manner. Let $F(\alpha)$ be the symbol of a function which can be
expanded in a power series in the variable $\alpha$; that is, within a certain
circle of convergence, let

    $F(\alpha) = a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots.$

We shall then define as $F(\mathfrak{A})$ the linear operator obtained by
replacing bodily in the preceding series the symbol $\alpha$ by the
symbol $\mathfrak{A}$. Every term of the series will then acquire the meaning of
a well-defined operator.

Example. Let us take for $\mathfrak{A}$ the linear operator $\lambda(d/dx)$ (where $\lambda$ is
a constant), and let us define the linear operator $e^{\mathfrak{A}}$ or $e^{\lambda(d/dx)}$. Since
the function $e^\alpha$ is defined by the series

    $e^\alpha = 1 + \alpha + \dfrac{1}{2!}\alpha^2 + \dfrac{1}{3!}\alpha^3 + \cdots,$

we have, replacing $\alpha$ by $\lambda(d/dx)$,

    $e^{\lambda(d/dx)} = 1 + \lambda\dfrac{d}{dx} + \dfrac{\lambda^2}{2!}\dfrac{d^2}{dx^2} + \dfrac{\lambda^3}{3!}\dfrac{d^3}{dx^3} + \cdots.$

The operator on the right has a rather important interpretation: it
changes $f(x)$ into $f(x + \lambda)$, since, by the Taylor formula,

    $f(x + \lambda) = f(x) + \lambda f'(x) + \dfrac{\lambda^2}{2!} f''(x) + \dfrac{\lambda^3}{3!} f'''(x) + \cdots.$

Hence we shall write

    $e^{\lambda(d/dx)} f(x) = f(x + \lambda).$
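
For a polynomial the series defining this operator terminates, so its action can be computed exactly and compared with a direct displacement of the argument. The sketch below represents a polynomial by its coefficient list [c0, c1, c2, ...]; the representation, the sample polynomial, the value of lambda, and the helper names are illustrative choices made here.

```python
import math

def differentiate(p):
    """d/dx on a polynomial given by its coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0.0]

def evaluate(p, x):
    """Horner evaluation of the polynomial p at the point x."""
    value = 0.0
    for c in reversed(p):
        value = value * x + c
    return value

def shift(p, lam):
    """Apply 1 + lam D + lam^2/2! D^2 + ... to p; for a polynomial the
    series stops once the derivatives vanish."""
    result = [0.0] * len(p)
    term = list(p)                      # holds D^k p
    for k in range(len(p)):
        factor = lam ** k / math.factorial(k)
        for i, c in enumerate(term):
            result[i] += factor * c
        term = differentiate(term)
    return result

p = [1.0, 0.0, -2.0, 1.0]               # f(x) = 1 - 2x^2 + x^3 (illustrative)
lam = 0.5
shifted = shift(p, lam)

print(shifted)
print(evaluate(shifted, 1.2), evaluate(p, 1.2 + lam))
```

The two printed values in the last line coincide up to rounding, which is the statement that the operator changes f(x) into f(x + lambda).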


It is evident that a linear operator 21 commutes with any power 
21”, and hence also with any E(2l). 

One further sees immediately that if an operator 23 commutes 
with 21, it also commutes with any F(2l). 

We now proceed to define a function of several linear operators $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}, \ldots$, confining ourselves (for ease of writing) to the case of two. Given a function of two variables which may be expanded

$$F(\alpha, \beta) = \sum_{i,k} a_{ik}\,\alpha^i\beta^k, \tag{17}$$

the immediate extension of the last case leads us to define $F(\mathfrak{A}, \mathfrak{B})$ by the linear operator

$$\mathfrak{F} = \sum_{i,k} a_{ik}\,\mathfrak{A}^i\mathfrak{B}^k. \tag{18}$$

However, it must be pointed out that although this definition may be adopted without hesitation if $\mathfrak{A}$ and $\mathfrak{B}$ commute, the case where they do not commute requires some attention. In fact, in a general term of (17) the order of the factors $\alpha$ and $\beta$ is indifferent, so that instead of $\alpha^i\beta^k$ we could write, for example, $\alpha^{i-1}\beta^k\alpha$, or many other forms. The same function $F(\alpha, \beta)$ always corresponds to all these ways of writing. On the other hand, if we replace $\alpha$ and $\beta$ by two noncommuting linear operators $\mathfrak{A}$ and $\mathfrak{B}$, we obtain as many different linear operators from these series. Therefore it is necessary to make the linear operator $\mathfrak{F} = F(\mathfrak{A}, \mathfrak{B})$ correspond, not to the analytic function $F(\alpha, \beta)$, but to a particular way of writing that function.
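The dependence on the order of writing can be seen with two small matrices standing in for noncommuting operators. This is only an illustrative sketch: the matrices `A` and `B` are hypothetical, and the single term considered is $\alpha^2\beta$, written in two ways that are identical for ordinary numbers.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two matrices that do not commute (arbitrary illustrative choice).
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]

# For numbers, alpha*alpha*beta and alpha*beta*alpha are the same function.
AAB = matmul(matmul(A, A), B)   # ordering alpha^2 beta
ABA = matmul(matmul(A, B), A)   # ordering alpha beta alpha

print(AAB)
print(ABA)
print(AAB == ABA)
```

The two orderings give different matrices here, so a rule for the order of factors has to accompany the function before it defines an operator.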

We then immediately realize that a linear operator which is a function of one or more commuting linear operators $\mathfrak{A}, \mathfrak{B}, \ldots$ will itself commute with each of these operators.

5. Representation of a linear operator by a matrix. It is easy to show that if $\mathfrak{A}$ is a linear operator which operates upon vectors of Hilbert space, then the components of the vector $\mathfrak{A}\mathbf{f}$ (which we shall indicate by $\mathbf{F}$) are linear combinations of the components of $\mathbf{f}$. This is equivalent to saying that any linear operator is equivalent to a linear transformation of the components of the vector to which it is applied (analogously to what occurs in vector mappings in ordinary space).

In fact, if as usual we call $\mathbf{y}_n$ the versors of the axes (eigenfunctions of a differential equation), we may write each vector $\mathbf{f}$ in the form

$$\mathbf{f} = \sum_n f_n\,\mathbf{y}_n,$$

where the $f_n$ are the components of $\mathbf{f}$, and the sum is understood to be extended from 1 to $\infty$ (this will also be implied for the formulas which follow). Applying the operator $\mathfrak{A}$, since it is linear, we have

$$\mathbf{F} = \mathfrak{A}\mathbf{f} = \sum_n f_n\,\mathfrak{A}\mathbf{y}_n, \tag{20}$$

so that it is sufficient to know the effect of the linear operator $\mathfrak{A}$ on the fundamental versors $\mathbf{y}_n$ in order to be able to apply it to any $\mathbf{f}$. Now, the vector $\mathfrak{A}\mathbf{y}_n$ will be specified by its components, which we shall indicate by $A_{mn}$ (where the first subscript characterizes the component and the second specifies the vector with which we are dealing). Hence we may write

$$\mathfrak{A}\mathbf{y}_n = \sum_m A_{mn}\,\mathbf{y}_m. \tag{21}$$

Substituting into (20), we have

$$\mathbf{F} = \sum_{m,n} f_n A_{mn}\,\mathbf{y}_m,$$

or else, writing

$$F_m = \sum_n A_{mn} f_n, \tag{22}$$

we have

$$\mathbf{F} = \sum_m F_m\,\mathbf{y}_m. \tag{20'}$$

From this sequence we see that the components of the vector $\mathfrak{A}\mathbf{f}$ are the $F_m$, which are obtained from the components $f_n$ of $\mathbf{f}$ by means of the system of (infinitely many) linear relations (22), whose coefficients are the $A_{mn}$. Hence a knowledge of these coefficients permits us to obtain for any $\mathbf{f}$ the corresponding $\mathfrak{A}\mathbf{f}$, and the ensemble of these coefficients completely determines the operator $\mathfrak{A}$.

The coefficients $A_{mn}$ constitute a double (denumerable) infinity of (generally complex) numbers, which may be placed in an array (matrix) of an infinite number of rows, characterized by the first subscript, and of an infinite number of columns, specified by the second subscript:

$$\begin{Vmatrix} A_{11} & A_{12} & A_{13} & \cdots \\ A_{21} & A_{22} & A_{23} & \cdots \\ A_{31} & A_{32} & A_{33} & \cdots \\ \vdots & \vdots & \vdots & \end{Vmatrix}$$

Thus to every linear operator $\mathfrak{A}$ there corresponds a matrix which perfectly specifies it and which is generally indicated by the same symbol $\mathfrak{A}$ as the operator. Often, when it is necessary to emphasize that we are dealing with the matrix, we shall designate it by $\{\mathfrak{A}\}$.

The elements $A_{mn}$ of this matrix may be calculated, with the observation that $A_{mn}$ represents the $m$th component of the vector $\mathfrak{A}\mathbf{y}_n$, and hence [see formula (8)]

$$A_{mn} = (\mathfrak{A}\mathbf{y}_n) \cdot \mathbf{y}_m. \tag{23}$$

Or, recalling (4) and writing the factor with the asterisk first to avoid ambiguities, we have

$$A_{mn} = \int y_m^*\,\mathfrak{A}\,y_n\,dx. \tag{23'}$$

Example. Let $\mathfrak{A}$ be the identity operator ($\mathfrak{A} = 1$). Then (23') yields

$$A_{mn} = \int y_m^*\,y_n\,dx = \delta_{mn}, \tag{24}$$

and hence the matrix representing the identity operator is

$$\begin{Vmatrix} 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \end{Vmatrix} \tag{25}$$

This is called the unit matrix and is indicated by $\{1\}$.

Note. In the general case of $p$ variables, every element of the matrix must be written in the form $A_{m_1 m_2 \cdots m_p,\; n_1 n_2 \cdots n_p}$, for reasons mentioned in §2; that is, we shall be dealing with a matrix in which the rows and columns are characterized, not by two indices, but by two groups of $p$ indices. This situation does not introduce any conceptual difficulty but merely complicates the writing considerably.

6. Matrix algebra. Since to any linear operator (fixed in the system of reference) there corresponds a matrix, and vice versa, it is evident that by the operations of sum, difference, and so on, defined between linear operators, as many operations between matrices are established. For instance, we shall call sum of the matrices $\{\mathfrak{A}\}$ and $\{\mathfrak{B}\}$, and shall indicate by $\{\mathfrak{A}\} + \{\mathfrak{B}\}$, the matrix $\{\mathfrak{A} + \mathfrak{B}\}$, that is, the matrix corresponding to the linear operator which is the sum of $\mathfrak{A}$ and $\mathfrak{B}$. The product of two matrices will be defined in a similar way, from which also the $n$th power ($n$ being a positive integer) of a matrix can be derived. Similarly, we shall define any analytic function $F(\{\mathfrak{A}\})$ of a matrix as the matrix corresponding to the linear operator $F(\mathfrak{A})$, defined in §4; and analogously for a function of several matrices. If $\mathfrak{A}$ possesses an inverse $\mathfrak{A}^{-1}$, the matrices $\{\mathfrak{A}\}$ and $\{\mathfrak{A}^{-1}\}$ will be called inverse, or reciprocal, of each other. Their product (in any order) is the unit matrix (25). The matrix $\{\mathfrak{A}^{-1}\}$ will be indicated by $\{\mathfrak{A}\}^{-1}$, and its powers will be considered as powers of $\{\mathfrak{A}\}$ with negative exponents.

It is preferable to give a definition which is equivalent to these operations between matrices but which is independent of the respective operators; consequently, we shall define the operations between matrices by means of operations to be performed upon their elements. This is done as follows.

The elements of the sum matrix $\{\mathfrak{A}\} + \{\mathfrak{B}\}$ are the sums of corresponding elements of the two matrices $\{\mathfrak{A}\}$ and $\{\mathfrak{B}\}$; that is,

$$\{\mathfrak{A}\} + \{\mathfrak{B}\} = \begin{Vmatrix} A_{11} + B_{11} & A_{12} + B_{12} & A_{13} + B_{13} & \cdots \\ \vdots & \vdots & \vdots & \\ A_{31} + B_{31} & A_{32} + B_{32} & A_{33} + B_{33} & \cdots \end{Vmatrix}$$

From (23) it follows immediately that the matrix sum, defined in this way, is actually the matrix corresponding to the operator $\mathfrak{A} + \mathfrak{B}$.

The same is true for the difference of two matrices and for the sum of any number of matrices.

The multiplication of a matrix $\{\mathfrak{A}\}$ by a constant $k$ is performed by multiplying each element of the matrix by $k$. This statement also follows immediately from (23).

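Let us proceed to the product of two matrices $\{\mathfrak{A}\}$, $\{\mathfrak{B}\}$. Let us call $\mathfrak{P}$ the linear operator

$$\mathfrak{P} = \mathfrak{A}\mathfrak{B} \tag{26}$$

and let us calculate by means of (23) the general element of the matrix product $\{\mathfrak{P}\} = \{\mathfrak{A}\}\{\mathfrak{B}\}$:

$$P_{mn} = (\mathfrak{P}\mathbf{y}_n) \cdot \mathbf{y}_m = (\mathfrak{A}\mathfrak{B}\mathbf{y}_n) \cdot \mathbf{y}_m. \tag{27}$$

Let us calculate the factor in parentheses. Because of (21), we have

$$\mathfrak{B}\mathbf{y}_n = \sum_i B_{in}\,\mathbf{y}_i$$

and hence

$$\mathfrak{A}\mathfrak{B}\mathbf{y}_n = \sum_i B_{in}\,\mathfrak{A}\mathbf{y}_i = \sum_i B_{in} \sum_k A_{ki}\,\mathbf{y}_k = \sum_k \Bigl(\sum_i A_{ki} B_{in}\Bigr)\mathbf{y}_k.$$

Substituting into (27) and recalling that the $\mathbf{y}$ are orthonormal, we have

$$P_{mn} = \sum_i A_{mi} B_{in}. \tag{28}$$

As a memory aid, note that by writing, as we have done, the two factors $A_{mi}$ and $B_{in}$ in the order corresponding to that of (26), the two subscripts $m$ and $n$ occur in the same order in the two terms, and the index of summation remains between them.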

Formula (28) may readily be seen to be equivalent to the 
following rule: The product of two matrices is obtained by the 
known rule for the product of two determinants, but always by 
multiplying the rows of the first matrix by the columns of the 
second matrix, and not conversely. 

Of course the product of two matrices is not commutative, except 
for the case in which the two corresponding operators commute, 
whereupon the two matrices are said to commute. 
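The "rows by columns" rule and the non-commutativity of the product can be illustrated with finite matrices. The sketch below assumes $2 \times 2$ matrices in place of the infinite ones of the text; the matrices, the vector `f`, and the function names are arbitrary illustrative choices.

```python
def apply(M, v):
    """Components F_m = sum_n M[m][n] * v[n], as in formula (22)."""
    return [sum(M[m][n] * v[n] for n in range(len(v))) for m in range(len(M))]

def product(A, B):
    """Rule (28): P[m][n] = sum_i A[m][i] * B[i][n] (rows of A by columns of B)."""
    size = len(A)
    return [[sum(A[m][i] * B[i][n] for i in range(size)) for n in range(size)]
            for m in range(size)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
f = [5, 7]

P = product(A, B)

# The product matrix acts on f like B followed by A.
same_action = apply(P, f) == apply(A, apply(B, f))

# Reversing the order of the factors gives a different matrix in general.
commute = product(A, B) == product(B, A)

print(P, same_action, commute)
```

With this particular pair the two orders give different matrices, which is the generic situation.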

By applying the preceding rule to the unit matrix (25) and to any other matrix $\{\mathfrak{A}\}$, it may readily be verified that

$$\{1\}\{\mathfrak{A}\} = \{\mathfrak{A}\}\{1\} = \{\mathfrak{A}\}.$$

6a. Representation of a function by means of a matrix with a single column. We may give an expressive interpretation to the formulas of §5 if we agree to consider the components $f_n$ (which characterize a function $f$ with respect to a certain system of orthogonal functions) to be elements of a matrix with a single column (and an infinity of rows). That is, we write

$$\{f\} = \begin{Vmatrix} f_1 \\ f_2 \\ \vdots \end{Vmatrix}$$

Now not only operators but also functions are represented by matrices, and we may readily verify that in order to operate on a function $f$ with an operator $\mathfrak{A}$ it is sufficient to form the product of the corresponding matrices by the ordinary rule, that is, by multiplying "rows by columns." In fact, upon applying this rule and keeping (22) in mind, we readily find

$$\begin{Vmatrix} A_{11} & A_{12} & A_{13} & \cdots \\ A_{21} & A_{22} & A_{23} & \cdots \\ A_{31} & A_{32} & A_{33} & \cdots \\ \vdots & \vdots & \vdots & \end{Vmatrix} \begin{Vmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \end{Vmatrix} = \begin{Vmatrix} F_1 \\ F_2 \\ F_3 \\ \vdots \end{Vmatrix}$$

7. Change of axes in Hilbert space. Transformation matrices. Let us now consider, in addition to the eigenfunctions $y_n$ defined by equation (7), another complete system of eigenfunctions $y'_n$ defined by another equation of the same form (referring to the same interval and with the same boundary conditions). This new set will define another system of orthogonal axes, specified by the versors $\mathbf{y}'_n$.* With respect to these axes the vector $\mathbf{f}$ will have certain components $f'_n$ expressed, just as for $f_n$, by

$$f'_n = \mathbf{f} \cdot \mathbf{y}'_n, \tag{30}$$

and we may write, in analogy to (8'),

$$\mathbf{f} = \sum_n f'_n\,\mathbf{y}'_n. \tag{31}$$

Now let us look for the relation between the components of the vector $\mathbf{f}$ in the new and the old axes, that is, between $f'_n$ and $f_n$. We note that each of the versors $\mathbf{y}'_n$ may be specified by its components (with respect to the old axes), which we shall designate by $S_{mn}$ (the first subscript specifies the component, the second subscript the versor with which we are dealing), so that

$$\mathbf{y}'_n = \sum_m S_{mn}\,\mathbf{y}_m. \tag{32}$$

Because of (8), we have

$$S_{mn} = \mathbf{y}'_n \cdot \mathbf{y}_m, \tag{33}$$

and the quantities $S_{mn}$ will obey the relations [see formula (10)]:

$$\sum_i S^*_{im} S_{in} = \delta_{mn}, \tag{34}$$

which express the fact that the $\mathbf{y}'_n$ are orthonormal.

* It need hardly be pointed out that the prime here (and in this whole section) does not have the meaning "derivative."

With this and from (30) and (32), using (6), we obtain for the new components the following expression:

$$f'_n = \sum_m S^*_{mn}\,\mathbf{f} \cdot \mathbf{y}_m,$$

or, because of (8),

$$f'_n = \sum_m S^*_{mn} f_m. \tag{35}$$

In order to obtain the inverse formulas, we could solve these for the $f_m$, but it is more convenient to proceed in symmetrical fashion, that is, to consider the components of the versors $\mathbf{y}$ with respect to the axes $\mathbf{y}'$. We shall indicate these components by $S'_{mn}$, putting

$$S'_{mn} = \mathbf{y}_n \cdot \mathbf{y}'_m, \tag{36}$$

so that

$$\mathbf{y}_n = \sum_m S'_{mn}\,\mathbf{y}'_m. \tag{32'}$$

We then find, analogously to (35),

$$f_n = \sum_m S'^{*}_{mn} f'_m. \tag{35'}$$

We then note that

$$S'_{mn} = (\mathbf{y}'_m \cdot \mathbf{y}_n)^* = S^*_{nm}, \tag{37}$$

so that we may also write

$$f_n = \sum_m S_{nm} f'_m. \tag{35''}$$

It is convenient to consider the $S_{mn}$ as elements of a matrix $\{\mathfrak{S}\}$. We shall say that we may pass from the components of a vector with respect to the $y$-axes to its components with respect to the $y'$-axes by means of the matrix $\{\mathfrak{S}\}$, that is, by means of the linear transformation (32). Similarly, the matrix $\{\mathfrak{S}'\}$, with elements $S'_{mn}$, effects the passage from the components $y'$ to the components $y$ by means of the transformation (32'). Furthermore, (37) expresses the fact that the matrix $\{\mathfrak{S}'\}$ is obtained from $\{\mathfrak{S}\}$ by interchanging its rows and columns (that is, by replacing each element with the one symmetrical with respect to the principal diagonal) and taking the complex conjugate of each element. If we form the product of these two matrices, $\{\mathfrak{S}'\}\{\mathfrak{S}\}$, by the rule of §6, we find, for its general element with indices $m$ and $n$, meanwhile recalling (37) and (34),

$$\sum_i S'_{mi} S_{in} = \sum_i S^*_{im} S_{in} = \delta_{mn}.$$

Hence we can write

$$\{\mathfrak{S}'\}\{\mathfrak{S}\} = \{1\}.$$

Similarly, it can be proved that $\{\mathfrak{S}\}\{\mathfrak{S}'\} = \{1\}$. Hence the matrix $\{\mathfrak{S}'\}$ is the inverse of $\{\mathfrak{S}\}$:

$$\{\mathfrak{S}'\} = \{\mathfrak{S}\}^{-1}. \tag{38}$$

This property, equivalent to (34), characterizes matrices which are said to be unitary.
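The unitarity property can be checked numerically on a small example. The sketch below assumes a two-dimensional space and a hypothetical matrix `S` whose columns are the components of two orthonormal new versors; it forms the conjugate transpose, verifies that it is the inverse, and carries a set of components to the new axes and back.

```python
def dagger(S):
    """Interchange rows and columns and take the complex conjugate of each element."""
    n = len(S)
    return [[S[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, v):
    return [sum(M[m][n] * v[n] for n in range(len(v))) for m in range(len(M))]

def is_unit_matrix(M, tol=1e-12):
    n = len(M)
    return all(abs(M[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

# Columns of S: components of the new versors on the old axes (illustrative).
S = [[0.6 + 0j, 0.8 + 0j],
     [0.8j, -0.6j]]
S_inv = dagger(S)

unitary = is_unit_matrix(matmul(S_inv, S)) and is_unit_matrix(matmul(S, S_inv))

f_old = [1 + 2j, 3 - 1j]
f_new = apply(S_inv, f_old)    # new components: f'_n = sum_m conj(S_mn) f_m
f_back = apply(S, f_new)       # inverse passage: f_n = sum_m S_nm f'_m
round_trip = all(abs(a - b) < 1e-12 for a, b in zip(f_old, f_back))

print(unitary, round_trip)
```

Floating-point arithmetic is used here, so the comparisons are made up to a small tolerance rather than exactly.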

The matrices introduced in this way are not considered to represent operators, since they are not used to pass from one vector to another, but instead from the components of a vector with respect to one system of reference to the components of the same vector with respect to another system of reference. They are therefore called transformation matrices.

Let us suppose that after having passed from the representation $y$ to the representation $y'$ by means of the matrix $\{\mathfrak{S}\}$, we pass on to a third (complete and orthogonal) representation of versors $y''$, by means of another transformation matrix $\{\mathfrak{T}\}$. We can then show that it is possible to pass directly from the representation $y$ to the representation $y''$ by means of the transformation matrix

$$\{\mathfrak{R}\} = \{\mathfrak{S}\}\{\mathfrak{T}\}.$$

In fact, the passage from $f'$ to $f''$ will be expressed by the following formula, analogous to (35):

$$f''_i = \sum_n T^*_{ni} f'_n.$$

Substituting (35) for $f'_n$, we have

$$f''_i = \sum_{m,n} T^*_{ni} S^*_{mn} f_m = \sum_m \Bigl(\sum_n S_{mn} T_{ni}\Bigr)^{*} f_m,$$

that is,

$$f''_i = \sum_m R^*_{mi} f_m.$$

Upon comparison with (35), this operation shows that we may pass from the $f$ to the $f''$ by means of the matrix $\{\mathfrak{R}\}$ in the same way in which the matrix $\{\mathfrak{S}\}$ effects the passage from the $f$ to the $f'$.

8. Representation of the same operator in different systems of reference. We have seen that an operator $\mathfrak{A}$, fixed in a system of axes (specified by the versors $\mathbf{y}_n$) in Hilbert space, is represented by a matrix $\{\mathfrak{A}\}$ whose elements are given by (23). If one now considers another system of axes $y'$, as we did in the last section, the problem arises of finding the matrix $\{\mathfrak{A}\}'$ which represents the same operator in the new system of reference, that is, the matrix whose elements $A'_{mn}$ express the components $F'_m$ of $\mathfrak{A}\mathbf{f}$ (with respect to the axes $y'$) in terms of the components $f'_n$ of $\mathbf{f}$ by the formula

$$F'_m = \sum_n A'_{mn} f'_n,$$

which corresponds to (22).

The element $A'_{mn}$ will be given by the formula

$$A'_{mn} = (\mathfrak{A}\mathbf{y}'_n) \cdot \mathbf{y}'_m, \tag{42}$$

which corresponds to (23).

Now we substitute for $\mathbf{y}'_n$ and $\mathbf{y}'_m$ their expressions in terms of $\mathbf{y}$, namely [see (32)],

$$\mathbf{y}'_n = \sum_i S_{in}\,\mathbf{y}_i, \qquad \mathbf{y}'_m = \sum_k S_{km}\,\mathbf{y}_k,$$

and we get, recalling (23),

$$A'_{mn} = \sum_{i,k} S^*_{km} A_{ki} S_{in},$$

or else, because of (37),

$$A'_{mn} = \sum_{i,k} S'_{mk} A_{ki} S_{in}.$$

However, by the multiplication rule, this is simply the element $(m, n)$ of the matrix $\{\mathfrak{S}'\}\{\mathfrak{A}\}\{\mathfrak{S}\}$, or else, by (38), $\{\mathfrak{S}\}^{-1}\{\mathfrak{A}\}\{\mathfrak{S}\}$. Hence we write

$$\{\mathfrak{A}\}' = \{\mathfrak{S}\}^{-1}\{\mathfrak{A}\}\{\mathfrak{S}\}. \tag{44}$$

This is the law according to which the matrix $\{\mathfrak{A}\}$ transforms in going from the $y$- to the $y'$-axes.
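The transformation law (44) can be tried out on finite matrices. The sketch below restricts itself, as an assumption, to a real orthogonal transformation matrix (so that the inverse is simply the transpose) and uses exact rational arithmetic; the matrices `S`, `A`, `B` are arbitrary. It checks that the transformed product equals the product of the transformed matrices, and likewise for the sum.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(S):
    return [list(row) for row in zip(*S)]

def transform(A, S):
    """Formula (44) for a real orthogonal S, whose inverse is its transpose."""
    return matmul(matmul(transpose(S), A), S)

# A rotation with rational entries (3-4-5 triangle), so S^T S = 1 exactly.
S = [[Fr(3, 5), Fr(-4, 5)],
     [Fr(4, 5), Fr(3, 5)]]
A = [[1, 2], [2, 3]]
B = [[0, 1], [1, 0]]

product_preserved = transform(matmul(A, B), S) == matmul(transform(A, S), transform(B, S))
sum_preserved = transform(add(A, B), S) == add(transform(A, S), transform(B, S))

print(product_preserved, sum_preserved)
```

For a complex unitary matrix the transpose would have to be replaced by the conjugate transpose; the check is otherwise the same.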

It need hardly be mentioned that the algebraic relations between matrices have the same form in any system of reference. If, for instance, we have in the first system

$$\{\mathfrak{A}\} = F(\{\mathfrak{B}\}) \tag{45}$$

($F$ standing for an analytic function), we still have in the second system

$$\{\mathfrak{A}\}' = F(\{\mathfrak{B}\}'). \tag{45'}$$

This is a direct consequence of the fact that both (45) and (45') express the same relation $\mathfrak{A} = F(\mathfrak{B})$ between the operators. Furthermore, it would be easy to prove this directly by verifying that the transformation (44) does not alter the relations for sum, product, and inverse, from which any analytic relation may be constructed.

9. Hermitian operators and matrices. Of particular interest in quantum mechanics are the linear operators $\mathfrak{A}$ possessing the property that for any function $f$ the product $\mathbf{f} \cdot \mathfrak{A}\mathbf{f}$ is real, that is,

$$\mathbf{f} \cdot \mathfrak{A}\mathbf{f} = (\mathbf{f} \cdot \mathfrak{A}\mathbf{f})^*. \tag{46}$$

These operators are called Hermitian.

As an important example, let us consider the operator which occurred in §1 of Part II, that is,

$$\mathfrak{L} = A\frac{d^2}{dx^2} + B\frac{d}{dx} + C \tag{47}$$

($A$, $B$, $C$ being real), and let us look for the condition which makes $\mathfrak{L}$ Hermitian. Applying (46), we see that for any $f$ we must have

$$\int A\,(f^* f'' - f^{*\prime\prime} f)\,dx + \int B\,(f^* f' - f^{*\prime} f)\,dx = 0.$$

Noting that the first of the two differences in parentheses is the derivative of the second one, and calculating the first integral by integration by parts, we obtain (if $f$ vanishes at the endpoints)

$$\int (A' - B)(f^* f' - f^{*\prime} f)\,dx = 0,$$

which requires, since $f$ is arbitrary, that $A' = B$. Hence we find the condition which we have already stated by saying that the equation $\mathfrak{L}f = 0$ is self-adjoint (see §3 of Part II).

The Hermitian property of an operator $\mathfrak{A}$ may be carried over into an important property of the matrix representing it (in any system of axes). In fact, it is apparent that if, as usual, we set $\mathbf{F} = \mathfrak{A}\mathbf{f}$, we have, because of (10) and (22),

$$\mathbf{f} \cdot \mathfrak{A}\mathbf{f} = \sum_m f_m F_m^* = \sum_{m,n} f_m A^*_{mn} f_n^*, \tag{48}$$

and taking its complex conjugate,

$$(\mathbf{f} \cdot \mathfrak{A}\mathbf{f})^* = \sum_{m,n} f_m^* A_{mn} f_n.$$

This equation may also be written, by interchanging the summation indices, as

$$(\mathbf{f} \cdot \mathfrak{A}\mathbf{f})^* = \sum_{m,n} f_m A_{nm} f_n^*. \tag{48'}$$

Substituting the last expression, along with (48), into the Hermitian condition (46), we obtain

$$A^*_{mn} = A_{nm}, \tag{46'}$$

that is, the elements lying symmetrically with respect to the principal diagonal are complex conjugates of each other. (In particular, the elements along the principal diagonal will be real.) In the language of §7, (46') states that the matrix $\{\mathfrak{A}\}$ coincides with the matrix obtained from it by interchanging rows and columns and taking the complex conjugate of each element. Such a matrix is called Hermitian. It is obvious that, conversely, a Hermitian matrix always represents a Hermitian operator.
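A quick numerical check of the equivalence between condition (46) and the symmetry (46') can be made with $2 \times 2$ matrices. The sketch assumes the scalar product $\mathbf{f} \cdot \mathbf{g} = \sum_m f_m g_m^*$ (conjugate on the second factor); the matrices `A_herm`, `A_other` and the vector `f` are arbitrary illustrative values.

```python
def apply(M, v):
    return [sum(M[m][n] * v[n] for n in range(len(v))) for m in range(len(M))]

def scalar_product(f, g):
    """f . g = sum_m f_m * conj(g_m), the convention assumed in this sketch."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

def is_hermitian(M):
    """Condition (46'): conj(M[m][n]) == M[n][m] for all m, n."""
    size = len(M)
    return all(M[m][n].conjugate() == M[n][m]
               for m in range(size) for n in range(size))

A_herm = [[2, 1 - 1j],
          [1 + 1j, 3]]          # satisfies (46')
A_other = [[2, 1 - 1j],
           [1 - 1j, 3]]         # violates (46')

f = [1 + 2j, 3 - 1j]

value_h = scalar_product(f, apply(A_herm, f))    # f . A f for the Hermitian matrix
value_n = scalar_product(f, apply(A_other, f))   # f . A f for the other matrix

print(is_hermitian(A_herm), value_h)
print(is_hermitian(A_other), value_n)
```

A single vector cannot prove Hermiticity, of course; it only shows that the non-Hermitian matrix already fails the reality condition for this `f`.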

From this analysis we may easily obtain another property of Hermitian operators: for any two functions $f$ and $g$, if (and only if) $\mathfrak{A}$ is Hermitian, we have

$$\mathbf{f} \cdot \mathfrak{A}\mathbf{g} = \mathfrak{A}\mathbf{f} \cdot \mathbf{g}. \tag{49}$$

In fact, calling $g_n$ the components of $\mathbf{g}$, we have, analogously to (48),

$$\mathbf{f} \cdot \mathfrak{A}\mathbf{g} = \sum_{m,n} f_m A^*_{mn} g_n^*,$$

$$\mathbf{g} \cdot \mathfrak{A}\mathbf{f} = \sum_{m,n} g_m A^*_{mn} f_n^*.$$

Taking the complex conjugate of the second summation and interchanging the indices $m$ and $n$, we recognize that, by virtue of (46'), this summation is identical with the first one; that is, (49) is verified.

It is apparent that if $\{\mathfrak{A}\}$ is a Hermitian matrix, so is the matrix $\{\mathfrak{A}\}'$ which corresponds to the former in any other system of reference. This fact may be recognized either by means of (44) or by observing that if $\{\mathfrak{A}\}$ is Hermitian, the operator which it represents will also be Hermitian, and hence the operator will be represented by a Hermitian matrix even in another reference frame.

In what follows we shall deal only with Hermitian operators and matrices.

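We shall now show that if $\mathfrak{A}$ and $\mathfrak{B}$ are linear Hermitian operators, then the two linear operators

$$\mathfrak{C} = \mathfrak{A}\mathfrak{B} + \mathfrak{B}\mathfrak{A} \tag{50}$$

and

$$\mathfrak{D} = i(\mathfrak{A}\mathfrak{B} - \mathfrak{B}\mathfrak{A}) \tag{50'}$$

are Hermitian, and in particular that if $\mathfrak{A}$ and $\mathfrak{B}$ commute, their product is Hermitian.

Indeed, from (49) we have, replacing $\mathbf{g}$ (an arbitrary vector) by $\mathfrak{B}\mathbf{g}$,

$$\mathbf{f} \cdot \mathfrak{A}\mathfrak{B}\mathbf{g} = \mathfrak{A}\mathbf{f} \cdot \mathfrak{B}\mathbf{g}. \tag{51}$$

We shall similarly find the formula [which may be obtained from (51) by interchanging $\mathfrak{A}$ and $\mathfrak{B}$, and $\mathbf{f}$ and $\mathbf{g}$]

$$\mathbf{g} \cdot \mathfrak{B}\mathfrak{A}\mathbf{f} = \mathfrak{B}\mathbf{g} \cdot \mathfrak{A}\mathbf{f},$$

or else, taking the complex conjugate of both sides,

$$\mathfrak{B}\mathfrak{A}\mathbf{f} \cdot \mathbf{g} = \mathfrak{A}\mathbf{f} \cdot \mathfrak{B}\mathbf{g}.$$

A comparison with (51) shows that

$$\mathbf{f} \cdot \mathfrak{A}\mathfrak{B}\mathbf{g} = \mathfrak{B}\mathfrak{A}\mathbf{f} \cdot \mathbf{g}.$$

Similarly, interchanging $\mathfrak{A}$ and $\mathfrak{B}$,

$$\mathbf{f} \cdot \mathfrak{B}\mathfrak{A}\mathbf{g} = \mathfrak{A}\mathfrak{B}\mathbf{f} \cdot \mathbf{g}.$$

Adding the last two expressions and using definition (50), we obtain

$$\mathbf{f} \cdot \mathfrak{C}\mathbf{g} = \mathfrak{C}\mathbf{f} \cdot \mathbf{g},$$

which expresses the fact that $\mathfrak{C}$ is Hermitian. Subtracting the two expressions and using definition (50'), we find

$$\mathbf{f} \cdot i\mathfrak{D}\mathbf{g} = -\,i\mathfrak{D}\mathbf{f} \cdot \mathbf{g}; \tag{51'}$$

and since $\mathbf{f} \cdot i\mathfrak{D}\mathbf{g} = -i\,\mathbf{f} \cdot \mathfrak{D}\mathbf{g}$ because of (5'), (51') may be written

$$\mathbf{f} \cdot \mathfrak{D}\mathbf{g} = \mathfrak{D}\mathbf{f} \cdot \mathbf{g},$$

expressing the fact that $\mathfrak{D}$ is Hermitian.

A corollary of the preceding theorem states that if $\mathfrak{A}$ is Hermitian, then all its powers are Hermitian, and hence also any analytic function (with real coefficients) of $\mathfrak{A}$ is Hermitian. Furthermore, a linear operator which is a function (with real coefficients) of several Hermitian and commuting linear operators is evidently itself Hermitian. However, if the linear operators of which it is a function do not commute, it need not be Hermitian; for example, the linear operator $\mathfrak{A}\mathfrak{B} - \mathfrak{B}\mathfrak{A}$, if it is not zero, is not Hermitian, because otherwise $\mathfrak{D}$ would not be Hermitian as given by (50').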
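The statements about the combinations (50) and (50') can be tried on a pair of noncommuting Hermitian matrices. This is a minimal sketch with arbitrarily chosen $2 \times 2$ matrices `A` and `B`; the names `C` and `D` mirror the operators of the text.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(X, Y, cx, cy):
    """Elementwise cx*X + cy*Y."""
    return [[cx * x + cy * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def is_hermitian(M):
    size = len(M)
    return all(M[m][n].conjugate() == M[n][m]
               for m in range(size) for n in range(size))

# Two Hermitian matrices that do not commute (illustrative values).
A = [[1, 1j], [-1j, 2]]
B = [[0, 1], [1, 3]]

AB = matmul(A, B)
BA = matmul(B, A)

C = combine(AB, BA, 1, 1)              # AB + BA, definition (50)
commutator = combine(AB, BA, 1, -1)    # AB - BA
D = [[1j * x for x in row] for row in commutator]   # i(AB - BA), definition (50')

print(is_hermitian(C), is_hermitian(D), is_hermitian(commutator))
```

The bare commutator fails the test while `i` times it passes, in line with the closing remark of the corollary.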

10. Principal axes of an operator. Given a (Hermitian) linear operator $\mathfrak{A}$, we ask the following question: Are there vectors (in Hilbert space) which the operator $\mathfrak{A}$ changes in magnitude but not in direction? This amounts to a search for functions $f$ such that

$$\mathfrak{A}f = Af, \tag{52}$$

where $A$ is a multiplicative constant to be determined.

This problem, in the case where $\mathfrak{A}$ is the operator $\mathfrak{L}$ of (47) (with $A' = B$), consists in trying to find solutions of integrable square of the self-adjoint equation

$$Af'' + A'f' + (C - L)f = 0,$$

where $L$ plays the role of parameter. As we have seen in §2 of Part II, there exist an infinite number of independent solutions (eigenfunctions) $f = y_n$, to each of which there corresponds a value $L_n$ of $L$ (eigenvalue), some of the $L_n$ possibly being equal to one another.

In general, we shall call eigenvalues of the linear operator $\mathfrak{A}$ the numbers $A_n$, and eigenfunctions the functions $u_n$ such that

$$\mathfrak{A}u_n = A_n u_n, \tag{53}$$

and we shall always suppose that the $u_n$ are normalized. Hence the latter represent as many unit vectors $\mathbf{u}_n$, which we may also suppose to be orthogonal to each other,¹⁰ so that we shall have

$$\mathbf{u}_n \cdot \mathbf{u}_m = \delta_{nm}. \tag{54}$$

The directions of these vectors are called the principal axes of the linear operator $\mathfrak{A}$, and any vector lying along one of these axes is changed by the operator $\mathfrak{A}$ in magnitude only, not in direction.

The eigenvalues of a linear Hermitian operator are always real, as we shall show later on. They belong to a complete system of orthogonal functions and may be either continuous or discrete. We shall generally refer to the discrete case, it being understood that in all formulas having continuous eigenvalues we must replace the summations by appropriate integrals, and that we are to adopt as criterion for normalization and orthogonality the explanation in §10 of Part II.

¹⁰ The proof of this fact for a general linear operator $\mathfrak{A}$ (provided it is Hermitian) is carried out in the same way as for the linear operator (47) in §5 of Part II. If $u_m$ and $u_n$ belong to two different eigenvalues $A_m$ and $A_n$, they are orthogonal. In fact, we have

$$\mathfrak{A}\mathbf{u}_n = A_n\mathbf{u}_n, \qquad \mathfrak{A}\mathbf{u}_m = A_m\mathbf{u}_m.$$

Multiplying the first of these through by $\mathbf{u}_m$ from the right and the second by $\mathbf{u}_n$ from the left (scalar product) and subtracting, we obtain

$$(\mathfrak{A}\mathbf{u}_n) \cdot \mathbf{u}_m - \mathbf{u}_n \cdot (\mathfrak{A}\mathbf{u}_m) = (A_n - A_m)\,\mathbf{u}_n \cdot \mathbf{u}_m.$$

But if $\mathfrak{A}$ is Hermitian, the left side is zero, and hence it follows (since $A_n \neq A_m$) that $\mathbf{u}_n \cdot \mathbf{u}_m = 0$, that is, that they are orthogonal.

If $A_n$ is a multiple eigenvalue to which there belong the $p$ independent eigenfunctions $u_n^1, u_n^2, \ldots, u_n^p$, the general eigenfunction belonging to this eigenvalue is of the form

$$u_n = \sum_{r=1}^{p} c_r\,u_n^r, \tag{55}$$

the coefficients $c$ being arbitrary. Of course for the $u_n^r$ we may substitute $p$ of their linear combinations, which are orthogonal to each other (this is proved as in §6). There is still a large degree of arbitrariness in the choice of the combinations.

Note on incomplete operators. In a problem in which several independent variables $x, \ldots$ occur, a linear operator is said to be complete if all the variables occur in its expression, and it is said to be incomplete if some of them are missing (for example, if $y$ and the operations of differentiation with respect to $y$ do not occur). We shall generally call $x$ the variables which occur in the linear operator $\mathfrak{A}$, and $y$ the variables which are missing. It is evident that by solving (53) we shall obtain for the $u_n$ some functions $\chi_n(x)$ of $x$ only, not of $y$. However, any one of these may be multiplied by an arbitrary function $f_n(y)$ without ceasing to satisfy (53). Because of this fact, $f_n(y)\chi_n(x)$ may also be considered as an eigenfunction $\varphi_n(x, y)$ of $\mathfrak{A}$, whose general expression will therefore be

$$\varphi_n(x, y) = f_n(y)\,\chi_n(x). \tag{56}$$

Hence, considering the Hilbert space of the functions of $x$ and $y$, we can say that in this space the principal axes of the incomplete linear operator $\mathfrak{A}$ are not determined (even if possible degeneracies are ignored) but may be chosen with a large degree of arbitrariness by selecting the functions $f_n(y)$ at will. In general it will be convenient to assume the $f_n(y)$ to be orthonormal in $y$; thereupon (the $\chi_n$ being assumed to be normalized in $x$) the functions $\varphi_n(x, y) = f_n\chi_n$ will constitute an orthonormal system of functions (but not a complete system).

This indeterminacy in the eigenfunctions of an incomplete linear operator may in a way be likened to a degeneracy, with the eigenvalue of $\mathfrak{A}$ considered as multiple of infinite order. In fact, let $u^r(y)$ be any complete system of orthogonal axes in the function space of $y$. The $f_n(y)$ mentioned above may be expanded in the form $f_n(y) = \sum_r f_n^r\,u^r(y)$, and hence (56) may be written

$$\varphi_n(x, y) = \sum_{r=1}^{\infty} f_n^r\,u^r\chi_n = \sum_{r=1}^{\infty} f_n^r\,\varphi_n^r(x, y), \tag{57}$$

where we put $\varphi_n^r(x, y) = u^r(y)\,\chi_n(x)$. This formula is analogous to (55). It expresses the most general eigenfunction of $\mathfrak{A}$ belonging to the eigenvalue $A_n$, as a linear combination (with arbitrary coefficients $f_n^r$) of the fundamental eigenfunctions $\varphi_n^r$.

For applications the following theorem is of importance: if $\varphi_n$ is an eigenfunction of $\mathfrak{A}$ belonging to the eigenvalue $A_n$, it is also an eigenfunction of $F(\mathfrak{A})$ belonging to the eigenvalue $F(A_n)$, where $F$ stands for any analytic function (see §4).

Proof. By hypothesis, we have

$$\mathfrak{A}\varphi_n = A_n\varphi_n.$$

Operating upon both sides with the linear operator $\mathfrak{A}$, we obtain

$$\mathfrak{A}^2\varphi_n = A_n\mathfrak{A}\varphi_n = A_n^2\varphi_n.$$

Repeating this process, we recognize that the following expression is valid for any power of $\mathfrak{A}$:

$$\mathfrak{A}^r\varphi_n = A_n^r\varphi_n. \tag{58}$$

Assume now a function $F$ defined by the series

$$F(\alpha) = \sum_r a_r\alpha^r,$$

and denote by $\mathfrak{F}$ the linear operator $F(\mathfrak{A})$. Then

$$\mathfrak{F}\varphi_n = \sum_r a_r\,\mathfrak{A}^r\varphi_n,$$

and because of (58),

$$\mathfrak{F}\varphi_n = \sum_r a_r A_n^r \cdot \varphi_n = F(A_n)\,\varphi_n, \tag{59}$$

which proves the theorem.
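The theorem can be checked on a finite example, with a polynomial standing in for the analytic function. In the sketch below the matrix `A`, its eigenvectors, and the coefficients of `F` are illustrative assumptions; `apply_function` builds $F(\mathfrak{A})$ applied to a vector term by term, as in the proof.

```python
def apply(M, v):
    return [sum(M[m][n] * v[n] for n in range(len(v))) for m in range(len(M))]

def apply_function(coeffs, M, v):
    """Apply F(M) = sum_r coeffs[r] * M**r to the vector v."""
    result = [0] * len(v)
    power_v = list(v)                  # M**r applied to v, starting with r = 0
    for a_r in coeffs:
        result = [x + a_r * y for x, y in zip(result, power_v)]
        power_v = apply(M, power_v)
    return result

def F(coeffs, a):
    """The same function evaluated on an ordinary number."""
    return sum(a_r * a ** r for r, a_r in enumerate(coeffs))

A = [[2, 1], [1, 2]]
coeffs = [1, -1, 0, 2]                 # F(alpha) = 1 - alpha + 2 alpha^3

# Eigenvectors of A with their eigenvalues: A(1,1) = 3(1,1), A(1,-1) = 1(1,-1).
pairs = [([1, 1], 3), ([1, -1], 1)]

checks = [apply_function(coeffs, A, phi) == [F(coeffs, a) * c for c in phi]
          for phi, a in pairs]
print(checks)
```

Each eigenvector of `A` is simply multiplied by `F` evaluated at its eigenvalue.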

It is also evident that if the function $F$ can be inverted (that is, if we may write $\mathfrak{A} = G(\mathfrak{F})$, where $G$ stands for an analytic function), the converse of this theorem also holds true. In this case, then, we may say that the linear operators $\mathfrak{A}$ and $\mathfrak{F} = F(\mathfrak{A})$ have the same eigenfunctions, with corresponding eigenvalues $A_n$ and $F(A_n)$.

In the case of continuous eigenvalues it is important to observe that if $\varphi_{A'}$ is an eigenfunction of $\mathfrak{A}$ belonging to the eigenvalue $A'$ and is normalized (according to the criterion of §10, Part II), it is also an eigenfunction of $F(\mathfrak{A})$ belonging to the eigenvalue $F(A')$ but is not normalized. To normalize it, it must be divided by $\sqrt{dF/dA'}$, as may easily be verified.

This theorem suggests an important generalization of the concept of function of a linear operator, which up to now has been limited to analytic functions. Let $F(\alpha)$ stand for a function of the variable $\alpha$ (possibly not developable in a series), and let $\mathfrak{A}$ be a linear operator with eigenvalues $A_n$ and eigenfunctions $\varphi_n$. We shall define the linear operator $F(\mathfrak{A})$ as the operator which, when applied to a $\varphi_n$, transforms it into $F(A_n)\varphi_n$, in accordance with (59), and hence when applied to any $f$ possessing the expansion

$$f = \sum_n f_n\,\varphi_n$$

transforms it into

$$F(\mathfrak{A})f = \sum_n F(A_n)\,f_n\,\varphi_n. \tag{60}$$

The only condition to which the function $F(\alpha)$ is subject is that it be uniquely defined for all values $A_n$ of $\alpha$.

11. Fundamental theorem on commuting operators. The necessary and sufficient condition for two linear operators $\mathfrak{A}$ and $\mathfrak{B}$ to have a complete system of eigenfunctions (and hence of principal axes) in common, is that they commute.

First we shall show that this condition is necessary. Let us suppose that there exists a complete system of orthogonal eigenfunctions $\varphi_1, \varphi_2, \ldots$ which are at the same time eigenfunctions of $\mathfrak{A}$ and $\mathfrak{B}$, so that we may write (numbering the subscripts of the eigenvalues suitably)

$$\mathfrak{A}\varphi_i = A_i\varphi_i, \qquad \mathfrak{B}\varphi_i = B_i\varphi_i,$$

where some of the $A_i$ may be equal to each other (and likewise for the $B_i$), so as to include cases of degeneracy.

Operating upon the first equation with the linear operator $\mathfrak{B}$ and upon the second one with $\mathfrak{A}$, we obtain, respectively,

$$\mathfrak{B}\mathfrak{A}\varphi_i = A_i\mathfrak{B}\varphi_i = A_i B_i\varphi_i,$$

$$\mathfrak{A}\mathfrak{B}\varphi_i = B_i\mathfrak{A}\varphi_i = B_i A_i\varphi_i,$$

and hence $\mathfrak{A}\mathfrak{B}\varphi_i = \mathfrak{B}\mathfrak{A}\varphi_i$.

Since the $\varphi_i$ form a complete set, any function $f$ may be expanded in a series of the $\varphi_i$, and hence for any $f$ the following expression holds:

$$\mathfrak{A}\mathfrak{B}f = \mathfrak{B}\mathfrak{A}f,$$

which means that

$$\mathfrak{A}\mathfrak{B} = \mathfrak{B}\mathfrak{A}. \tag{61}$$

Let us now prove that the condition is sufficient. Let us suppose that (61) holds and let us call $\varphi_i$ a complete set of eigenfunctions of the operator $\mathfrak{A}$, such that

$$\mathfrak{A}\varphi_i = A_i\varphi_i,$$

and let $\psi_j$ be a complete set of eigenfunctions of $\mathfrak{B}$ (all orthogonal to one another) for which

$$\mathfrak{B}\psi_j = B_j\psi_j.$$

Let us develop the $\varphi_i$ in a series of the $\psi_j$,

$$\varphi_i = \sum_j c_{ij}\,\psi_j,$$

and apply the operator $\mathfrak{A}F(\mathfrak{B})$ to both sides, where $F(\mathfrak{B})$ is any linear operator function of $\mathfrak{B}$. Then we have, making use of the theorem of §10,

$$\mathfrak{A}F(\mathfrak{B})\varphi_i = \sum_j c_{ij}\,F(B_j)\,\mathfrak{A}\psi_j. \tag{62}$$

On the other hand, since $\mathfrak{A}$ commutes with $\mathfrak{B}$, it also commutes with $F(\mathfrak{B})$, and hence

$$\mathfrak{A}F(\mathfrak{B})\varphi_i = F(\mathfrak{B})\mathfrak{A}\varphi_i = A_i F(\mathfrak{B})\varphi_i = A_i \sum_j c_{ij}\,F(\mathfrak{B})\psi_j = \sum_j c_{ij}\,F(B_j)\,A_i\psi_j.$$

Comparing this with (62), we obtain

$$\sum_j c_{ij}\,F(B_j)\,\mathfrak{A}\psi_j = \sum_j c_{ij}\,F(B_j)\,A_i\psi_j, \tag{63}$$

and since $F$ is an arbitrary function, the coefficients $F(B_j)$ must be considered to be entirely arbitrary numbers. It follows that equation (63) cannot hold, unless for each term in which $c_{ij} \neq 0$,

$$\mathfrak{A}\psi_j = A_i\psi_j \qquad (i = 1, 2, \ldots). \tag{63'}$$

We now observe, however, that for a given $j$ at least one of the coefficients $c_{ij}$ must be $\neq 0$. Otherwise, in fact, the $\psi_j$ in question would be orthogonal to all the $\varphi_i$ (since $c_{ij} = \varphi_i \cdot \psi_j$), which is impossible because the $\varphi_i$ constitute a complete set. It follows further that for any $\psi_j$ there exists at least one index $i$ for which (63') holds, or that each $\psi_j$ is an eigenfunction of $\mathfrak{A}$ (belonging to the eigenvalue $A_i$). Hence we have proved that the $\psi_j$ make up a complete set of eigenfunctions which are common to $\mathfrak{A}$ and $\mathfrak{B}$.
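A small finite example may make the content of the theorem concrete, including the degenerate case. In this sketch the $3 \times 3$ matrices `A` and `B` are hypothetical: `A` has a double eigenvalue, the two matrices commute, and only a particular choice of axes within the degenerate plane consists of eigenvectors of both.

```python
from fractions import Fraction

def apply(M, v):
    return [sum(M[m][n] * v[n] for n in range(len(v))) for m in range(len(M))]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eigenvalue_or_none(M, v):
    """Return lam if M v = lam v for the nonzero integer vector v, otherwise None."""
    w = apply(M, v)
    k = next(i for i, x in enumerate(v) if x != 0)
    lam = Fraction(w[k], v[k])
    return lam if all(w[i] == lam * v[i] for i in range(len(v))) else None

A = [[2, 0, 0], [0, 2, 0], [0, 0, 5]]   # eigenvalue 2 is double
B = [[0, 1, 0], [1, 0, 0], [0, 0, 7]]

commute = matmul(A, B) == matmul(B, A)

# A complete set of vectors that are eigenvectors of both A and B.
common = [[1, 1, 0], [1, -1, 0], [0, 0, 1]]
pairs = [(eigenvalue_or_none(A, v), eigenvalue_or_none(B, v)) for v in common]

# (1, 0, 0) is an eigenvector of A (inside the degenerate plane) but not of B.
e1_for_A = eigenvalue_or_none(A, [1, 0, 0])
e1_for_B = eigenvalue_or_none(B, [1, 0, 0])

print(commute, pairs, e1_for_A, e1_for_B)
```

The vector `(1, 0, 0)` shows that an arbitrary choice of axes in the degenerate subspace of `A` is generally not shared with `B`.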

Note. The preceding proof also holds for the cases in which multiple eigenvalues are present. Let us focus our attention particularly upon the meaning of this theorem in this case. If, for instance, $A_i$ is a multiple eigenvalue of $\mathfrak{A}$, of order $p$, one may select, in an infinite number of ways, $p$ orthogonal and independent eigenfunctions belonging to it (whose vectors form a plane manifold $V$ in $p$ dimensions). In general, however, for only one of these choices do we obtain eigenfunctions which are common to both operators. (It is apparent that to the unique eigenvalue $A_i$ there correspond $p$ eigenvalues of $\mathfrak{B}$; that is, there are $p$ principal axes of $\mathfrak{B}$ within the subspace $V$.) This statement also applies in the case of infinite multiplicity, that is, when one of the operators, such as $\mathfrak{A}$, is incomplete. In this case the theorem has the following implication. Let there be, for instance, two variables $x$ and $y$, of which only $x$ occurs in $\mathfrak{A}$. We already know that if we call $\chi_i(x)$ the eigenfunctions of $\mathfrak{A}$ in the function space of $x$, it is possible to consider, as eigenfunctions of $\mathfrak{A}$ in the function space of $x$ and $y$, all the functions of the form $\varphi_i(x, y) = f(y)\,\chi_i(x)$, $f$ being an arbitrary function of $y$ alone. However, only some of these eigenfunctions are common to $\mathfrak{A}$ and $\mathfrak{B}$; specifically, $f(y)$ must be chosen from among a discrete (or possibly continuous) aggregate of functions $f_k(y)$, such that $\psi_{ik} = f_k(y)\,\chi_i(x)$, where we call $\psi_{ik}$ the eigenfunctions of $\mathfrak{B}$ (we specify these eigenfunctions explicitly by two indices, because they depend on the two variables $x$ and $y$). To each $A_i$ there correspond an infinite number of eigenfunctions of $\mathfrak{B}$.

12. Diagonal matrices. Let us take the principal axes of
$\mathfrak{A}$ (with versors $u_n$) for our coordinate axes in Hilbert
space, and let us look for the form the matrix takes in representing
the linear operator $\mathfrak{A}$ in reference to these axes. We shall
designate this matrix by $\{\mathfrak{A}\}'$ and shall continue to
denote by $\{\mathfrak{A}\}$ the matrix which represents $\mathfrak{A}$
with respect to the general axes $y_n$.
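The general element of the matrix $\{\mathfrak{A}\}'$ will be,
according to (23),

$$A'_{mn} = (\mathfrak{A}u_n) \cdot u_m,$$

and, because of (53) and (54),

$$A'_{mn} = A_n\, u_n \cdot u_m = A_n \delta_{mn}. \tag{64}$$

In other words, the matrix $\{\mathfrak{A}\}'$ will be

$$\{\mathfrak{A}\}' =
\begin{pmatrix}
A_1 & 0 & 0 & \cdots \\
0 & A_2 & 0 & \cdots \\
0 & 0 & A_3 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}.$$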

A matrix such as this one, in which all the elements are zero
except for those along the principal diagonal (diagonal elements), is
called a diagonal matrix. Hence we can say: A linear operator is
represented, with respect to its principal axes, by a diagonal matrix;
the diagonal elements of this matrix are the eigenvalues of the
operator.

It is apparent that if the operator is Hermitian, the matrix 
will also be Hermitian, which means, for a diagonal matrix, that 
its elements are real. Hence the eigenvalues of a linear Hermitian 
operator are always real.

Let there be given a linear operator $\mathfrak{A}$ by means of the
matrix $\{\mathfrak{A}\}$ representing it in the general axes $y_n$.
We want to find the (diagonal) matrix $\{\mathfrak{A}\}'$ which
represents $\mathfrak{A}$ with respect to its principal axes. This
operation is sometimes called reduction of the matrix
$\{\mathfrak{A}\}$ to the diagonal form or diagonalization of the
matrix $\{\mathfrak{A}\}$.
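
As a small numerical check of the statements above, the following
sketch diagonalizes a 2 x 2 Hermitian matrix in closed form and verifies
that, referred to its principal axes, it becomes diagonal with real
diagonal elements. It is an illustration only: the function names, the
sample matrix, and the restriction to two dimensions with a nonzero
off-diagonal element are assumptions made here, not part of the text.

```python
import math

def diagonalize_hermitian_2x2(a, b, d):
    """Diagonalize [[a, b], [conj(b), d]] with a, d real and b != 0.

    Returns (eigenvalues, S); column k of S holds the components of the
    k-th normalized eigenvector (the k-th principal axis).
    """
    mean = (a + d) / 2.0
    # The quantity under the root is a sum of squares, so both roots are real.
    half_gap = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    eigenvalues = [mean - half_gap, mean + half_gap]
    columns = []
    for lam in eigenvalues:
        # (a - lam) v1 + b v2 = 0 is solved by v = (b, lam - a).
        v = [complex(b), complex(lam - a)]
        norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
        columns.append([v[0] / norm, v[1] / norm])
    S = [[columns[k][m] for k in range(2)] for m in range(2)]
    return eigenvalues, S

def to_principal_axes(A, S):
    """Return the matrix S^dagger A S (the operator in the new axes)."""
    n = len(A)
    AS = [[sum(A[m][r] * S[r][k] for r in range(n)) for k in range(n)]
          for m in range(n)]
    return [[sum(S[r][m].conjugate() * AS[r][k] for r in range(n))
             for k in range(n)] for m in range(n)]

# Hypothetical sample matrix (Hermitian by construction).
a, b, d = 2.0, 1.0 - 1.0j, -1.0
A = [[a, b], [b.conjugate(), d]]
eigenvalues, S = diagonalize_hermitian_2x2(a, b, d)
A_prime = to_principal_axes(A, S)
```

The off-diagonal elements of `A_prime` vanish to rounding error, and
its diagonal elements reproduce `eigenvalues`.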

We recall from §8 that $\{\mathfrak{A}\}'$ is obtained from
$\{\mathfrak{A}\}$ by formula (44). We are to find the transformation
matrix $\{\mathfrak{S}\}$. For this purpose we note that (44) may also
be written

$$\{\mathfrak{S}\}\{\mathfrak{A}\}' = \{\mathfrak{A}\}\{\mathfrak{S}\},$$

or else, by equating the general term $(m, k)$ on both sides,

$$\sum_r S_{mr} A'_{rk} = \sum_r A_{mr} S_{rk};$$

but since $A'_{rk} = A_k \delta_{rk}$, the first summation reduces to
the one term in which $r = k$, so that the equation becomes

$$A_k S_{mk} = \sum_r A_{mr} S_{rk}.$$

Let us keep $k$ fixed and assign to $m$ the successive values
1, 2, ...; we then obtain the equations

$$\begin{aligned}
(A_{11} - A_k) S_{1k} + A_{12} S_{2k} + A_{13} S_{3k} + \cdots &= 0 \\
A_{21} S_{1k} + (A_{22} - A_k) S_{2k} + A_{23} S_{3k} + \cdots &= 0 \\
A_{31} S_{1k} + A_{32} S_{2k} + (A_{33} - A_k) S_{3k} + \cdots &= 0 \\
\cdots\cdots\cdots
\end{aligned} \tag{65}$$

These constitute a system of infinitely many linear homogeneous
equations in the infinite number of unknowns $S_{1k}$, $S_{2k}$, ...

In order to understand the nature of this problem, let us first
consider the case which occurs in ordinary three-dimensional space.
Here the system (65) reduces to a system of three linear homogeneous
equations in the three unknowns $S_{1k}$, $S_{2k}$, $S_{3k}$ (we could
write three of these systems, corresponding to $k = 1$, $k = 2$,
$k = 3$), and possesses nonzero solutions only if

$$\begin{vmatrix}
A_{11} - A_k & A_{12} & A_{13} \\
A_{21} & A_{22} - A_k & A_{23} \\
A_{31} & A_{32} & A_{33} - A_k
\end{vmatrix} = 0. \tag{66}$$





This equation in $A_k$, called the secular equation (which is the same
for all three systems), possesses, as is known from algebra, three
roots which are always real (the $A_{mn}$ forming a Hermitian matrix),
$A_1$, $A_2$, $A_3$. Once found, they may be substituted into the three
systems, and we can easily find the $S_{mk}$ to within a factor which
is determined by the conditions (34). This operation is seen to be
identical with the classical procedure for the determination of the
principal axes of a quadric surface (whose coefficients for the second
degree terms are the $A_{mn}$). The $S_{mk}$ furnish the direction
cosines of these axes, and the lengths of the three semiaxes are given
by $1/\sqrt{A_k}$.

Hence it may be seen that the problem of reducing a matrix to its
diagonal form (or of finding the eigenvalues and eigenfunctions of a
linear operator $\mathfrak{A}$) is the generalization, to the case of
infinitely many dimensions, of the problem of the reduction of a
quadric surface to its principal axes. Generalizing what is known for a
system of $n$ linear homogeneous equations with $n$ unknowns to the
case where $n$ is infinite, we are led to the conclusion that the
system (65) has nonzero solutions only if $A_k$ has certain definite
values (forming an infinite and generally discrete sequence) which are
just the desired eigenvalues.
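
The three-dimensional case can be tried out numerically. The sketch
below finds the three roots of the secular equation for a real
symmetric 3 x 3 matrix by the standard trigonometric solution of the
cubic (a technique chosen here for convenience, not one described in
the text) and checks that each root makes the secular determinant
vanish. The sample matrix and all function names are hypothetical, and
the semiaxes assume a quadric normalized so that its second-degree form
equals 1.

```python
import math

def det3(M):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def secular_determinant(A, lam):
    """Value of det(A - lam * I); it vanishes when lam is an eigenvalue."""
    shifted = [[A[m][n] - (lam if m == n else 0.0) for n in range(3)]
               for m in range(3)]
    return det3(shifted)

def secular_roots(A):
    """Three real roots (ascending) for a real symmetric 3 x 3 matrix A."""
    off = A[0][1] ** 2 + A[0][2] ** 2 + A[1][2] ** 2
    if off == 0.0:  # already diagonal
        return sorted(A[i][i] for i in range(3))
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p = math.sqrt((sum((A[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    B = [[(A[m][n] - (q if m == n else 0.0)) / p for n in range(3)]
         for m in range(3)]
    r = max(-1.0, min(1.0, det3(B) / 2.0))  # guard against rounding
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest   # the trace is the sum of the roots
    return [smallest, middle, largest]

# Hypothetical coefficients A_mn of a quadric (an ellipsoid, since all
# three roots turn out positive).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
roots = secular_roots(A)
semiaxes = [1.0 / math.sqrt(root) for root in roots]
```

Substituting each root back into the three homogeneous equations would
then give the direction cosines of the corresponding axis, up to the
normalization factor.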

13. Matrices with continuous indices. Throughout the preceding
sections we have assumed that the eigenfunctions which define the
coordinate axes in Hilbert space form a discrete sequence. But, as we
have said in §10 of Part II, when the region within which the
differential equation is to be solved is infinite, eigenvalues forming
continuous spectra may occur (besides possible discrete eigenvalues).
Thus we are led to consider cases in which the indices occurring in the
previous formulas vary continuously rather than taking on only integral
values. Extension of the previous considerations to such cases contains
some delicate points, for a rigorous treatment of which we refer to
special works.* We shall restrict ourselves here to general remarks of
an intuitive nature.

* See, for instance, in addition to the works of Dirac, the paper by
E. H. Kennard in Zeits. f. Physik 44, 326 (1927).

First we observe that if $a_n$ is any sequence of quantities
corresponding to the integral values of the index $n$, when the latter
index becomes a continuous variable $\lambda$ we must consider
$a_\lambda$ as standing for an ordinary function of $\lambda$, which
may thus also be written $a(\lambda)$. Hence, as the eigenfunctions
$y_n$ are replaced by $y_\lambda$ ($\lambda$ is a continuous index),
every vector $\mathbf{f}$ in Hilbert space is represented, with respect
to the axes $y_\lambda$, by the following function of $\lambda$ instead
of by the components $f_n$:

$$f_\lambda = \mathbf{f} \cdot y_\lambda$$

(component of $\mathbf{f}$ along the $y_\lambda$-axis). Similarly, all
formulas of the theory of vectors in Hilbert space will be modified by
substituting integrals for the summations with a single index. For
example, (10) becomes

$$\mathbf{f} \cdot \mathbf{g} = \int f_\lambda\, g_\lambda^{*}\, d\lambda.$$

We shall now see what becomes of the matrices in this case. An
expression with two subscripts $A_{mn}$, when $m$ and $n$ become two
continuous variables $\mu$, $\lambda$ in the interval $(a, b)$, will
become an ordinary function of two variables, which may be written in
the usual manner $A(\mu, \lambda)$ instead of $A_{\mu\lambda}$.

Therefore the concept of matrix with continuous indices (or
"continuous matrix") is equivalent to the concept of function of two
variables. If we want to retain the same physical picture of the
matrix, we must interpret $\mu$ and $\lambda$ as Cartesian coordinates
in a plane, taking the axes as in Fig. 46. Then to each point of the
crosshatched square we may make correspond a value of
$A(\mu, \lambda)$. We may say that the elements of such a matrix
"continuously fill" the crosshatched square (which may eventually be
extended to infinity, becoming a "quadrant" if $b = \infty$).

All the definitions that have already been given may be extended to
these continuous matrices. For instance, the product of the two
matrices $A(\mu, \lambda)$ and $B(\rho, \sigma)$ is the matrix

$$C(\mu, \sigma) = \int_a^b A(\mu, \lambda)\, B(\lambda, \sigma)\, d\lambda.$$
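
Reading the product of two continuous matrices as the integral
$\int_a^b A(\mu, \lambda) B(\lambda, \sigma)\, d\lambda$ (the direct
analogue of the row-by-column rule, with the sum replaced by an
integral), one can approximate it on a grid. The following is only a
sketch: the two sample "matrices", the interval (0, 1), and the
midpoint rule are choices made here for illustration.

```python
def continuous_product(A, B, a, b, steps=2000):
    """Return C with C(mu, sigma) ~ integral_a^b A(mu, lam) B(lam, sigma) d lam."""
    width = (b - a) / steps

    def C(mu, sigma):
        total = 0.0
        for i in range(steps):
            lam = a + (i + 0.5) * width   # midpoint of the i-th cell
            total += A(mu, lam) * B(lam, sigma)
        return total * width

    return C

def A(mu, lam):
    """Hypothetical continuous matrix A(mu, lambda) = mu * lambda."""
    return mu * lam

def B(lam, sigma):
    """Hypothetical continuous matrix B(lambda, sigma) = lambda + sigma."""
    return lam + sigma

AB = continuous_product(A, B, 0.0, 1.0)
BA = continuous_product(B, A, 0.0, 1.0)

def exact_AB(mu, sigma):
    """Closed form of the integral of mu * lam * (lam + sigma) over (0, 1)."""
    return mu * (1.0 / 3.0 + sigma / 2.0)
```

As with ordinary matrices, the order of the factors matters: `AB` and
`BA` are different functions of their two arguments.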



A continuous matrix is said to be Hermitian if

$$A^{*}(\mu, \lambda) = A(\lambda, \mu). \tag{67}$$

A Hermitian operator $\mathfrak{A}$ is represented, with respect to the
axes specified by the eigenfunctions $y_\lambda$, by just such a
matrix, whose general element is, in analogy to (23) and (23'),

$$A(\mu, \lambda) = (\mathfrak{A}y_\lambda) \cdot y_\mu
= \int y_\mu^{*}\, \mathfrak{A}y_\lambda\, dx.$$

By means of this continuous matrix, the components of the vector
$\mathbf{F} = \mathfrak{A}\mathbf{f}$ are obtained from the components
of $\mathbf{f}$ by the formula, analogous to (22),

$$F_\mu = \int_a^b A(\mu, \lambda)\, f_\lambda\, d\lambda.$$

In order to pass from one system of axes $y_\lambda$ to another system
$y'_\lambda$, we must introduce a (continuous) transformation matrix
defined by [see (33)]

$$S(\mu, \lambda) = y'_\lambda \cdot y_\mu.$$

By means of this matrix, we may pass from the components $f$ of the
vector $\mathbf{f}$ to the components $f'$ of the same vector with
respect to the new axes, by means of a formula analogous to (35):

$$f'_\lambda = \int_a^b S^{*}(\mu, \lambda)\, f_\mu\, d\mu. \tag{68}$$

The inverse transformation is given by [see (35'')]

$$f_\mu = \int_a^b S(\mu, \lambda)\, f'_\lambda\, d\lambda. \tag{68'}$$

In this change of axes, by means of (44) the matrix $A(k, j)$
representing an operator $\mathfrak{A}$ with respect to the axes
$y_\lambda$ changes over into the matrix $A'(\mu, \lambda)$
representing the same operator in the new axes. Formula (44), when
written explicitly, becomes the following, analogous to (43):

$$A'(\mu, \lambda) = \int_a^b\!\!\int_a^b
S^{*}(k, \mu)\, A(k, j)\, S(j, \lambda)\, dk\, dj. \tag{69}$$

In conclusion, we shall mention the case in which the equation defining
the $y$ has discrete eigenvalues as well as a continuous spectrum
between $a$ and $b$. Here the previous formulas are modified in such a
way that to each integral between $a$ and $b$ there must be added a
summation over the discrete eigenvalues. This change introduces some
complication into the writing (which may actually be avoided by the use
of Stieltjes integrals; see No. 14 of the Bibliography, page 122) but
adds no conceptual difficulties.

14. The Dirac δ-function. The extension to the case of continuous
indices of the unit matrix introduced in §5 (that is, the
generalization of the symbol $\delta_{mn}$) leads us to introduce a
symbol $\delta(x)$ which is rather convenient in calculations, and is
called the Dirac δ-function. It represents a function which has the
following properties:

$$\delta(x) = 0 \quad \text{for } x \neq 0, \tag{70}$$

while at the point 0, $\delta$ is infinite in such a way that

$$\int_a^b \delta(x)\, dx = 1, \tag{71}$$

where $a$ and $b$ are any two limits ($a < b$), containing 0 between
them. Strictly speaking, no function possessing these properties
exists, and hence $\delta(x)$ is called an improper function. We may
approximate it as closely as we like, however, by means of analytic
functions.* For example, the Gaussian function
$(h/\sqrt{\pi})\, e^{-h^2 x^2}$ for $h \to \infty$ approaches the shape
of $\delta(x)$. Hence we may imagine the curve representing
$\delta(x)$ as an infinitely narrow and infinitely high bell-shaped
curve, enclosing an area equal to unity.

* The use of the improper function may be avoided by replacing it by
the Stieltjes integral concept, as was done systematically by Neumann
(see No. 13 of the Bibliography). Its use, however, contributes to
making the formulas simpler and more expressive, and may be considered
as the abbreviated indication of a passage to the limit.

We shall generally make use of the function $\delta(x - x_0)$, which
exhibits at the point $x = x_0$ the same singularity $\delta(x)$ has at
$x = 0$. It has the fundamental property that, if $f(x)$ is any
function (provided it is bounded over the interval considered and
continuous at $x_0$), then

$$\int_a^b f(x)\, \delta(x - x_0)\, dx =
\begin{cases}
0 & \text{if the interval } (a, b) \text{ does not contain } x_0, \\
f(x_0) & \text{if the interval } (a, b) \text{ does contain } x_0.
\end{cases} \tag{72}$$

In fact, if the interval does not contain $x_0$, the function $\delta$
is zero over the entire interval, and hence the integral is zero. If
the interval $(a, b)$ contains $x_0$, the integral may be limited to a
segment $x_0 - \epsilon$, $x_0 + \epsilon$ ($\epsilon$ being as small
as desired). Then, calling $\bar{f}$ a value lying between the lower
and upper limits of $f$ in the interval
$(x_0 - \epsilon, x_0 + \epsilon)$, we have

$$\int_a^b f(x)\, \delta(x - x_0)\, dx
= \int_{x_0 - \epsilon}^{x_0 + \epsilon} f(x)\, \delta(x - x_0)\, dx
= \bar{f} \int_{x_0 - \epsilon}^{x_0 + \epsilon} \delta(x - x_0)\, dx
= \bar{f};$$

and since this must hold for any $\epsilon$, we must have
$\bar{f} = f(x_0)$.
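
The passage to the limit can be followed numerically by replacing the
improper function with a narrow Gaussian of unit area,
$(h/\sqrt{\pi})\, e^{-h^2 x^2}$, and letting $h$ grow. The sketch below
is an illustration under assumptions made here: the test function
(a cosine), the point $x_0 = 0.5$, the intervals, and the midpoint
integration rule are all arbitrary choices.

```python
import math

def gaussian_delta(x, h):
    """Gaussian of unit area, (h / sqrt(pi)) * exp(-h^2 x^2)."""
    return h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)

def sift(f, x0, a, b, h, steps=100000):
    """Midpoint estimate of the integral of f(x) * delta_h(x - x0) over (a, b)."""
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * width
        total += f(x) * gaussian_delta(x - x0, h)
    return total * width

x0 = 0.5
# Interval (-1, 2) contains x0: the estimates approach f(x0) as h grows.
errors = [abs(sift(math.cos, x0, -1.0, 2.0, h) - math.cos(x0))
          for h in (1.0, 10.0, 100.0)]
# Interval (1, 2) does not contain x0: the estimate is negligible.
outside = sift(math.cos, x0, 1.0, 2.0, 100.0)
```

The entries of `errors` shrink as the Gaussian narrows, while `outside`
is zero to machine precision, in line with the two cases of the
fundamental property.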

A property of $\delta$ which we shall often use is expressed by the
formula

$$(x - x_0)\, \delta(x - x_0) = 0, \tag{73}$$

which is to be justified by saying that for all values of $x$ for which
the first factor does not vanish, the second factor does vanish. More
precisely, we say that (73) is to be understood in the sense that the
integral of the left side with respect to $x$, extended over any
interval (as small as desired), is zero.

The introduction of the improper function $\delta$ makes possible a
formal consideration of the axes of Hilbert space, which we have called
"continuous" in §1, as principal axes of a linear operator, that is, as
a particular case of the $y_n$-axes considered up to now. In fact, we
consider the linear operator $\mathfrak{x} = x$ (multiplication by $x$)
and look for its principal axes, writing the equation

$$\mathfrak{x}\, \varphi_{x'}(x) = x'\, \varphi_{x'}(x), \tag{74}$$

or

$$(x - x')\, \varphi_{x'} = 0, \tag{74'}$$

where we have indicated the eigenvalue by $x'$ (since we are dealing
with continuous eigenvalues, as we shall see). Now (74') is satisfied
by taking any $x'$ and

$$\varphi_{x'}(x) = \delta(x - x'), \tag{75}$$

as is shown by (73). In addition, the eigenfunctions are orthonormal,
since (see §10 of Part II), upon calling $\Delta_1 x'$, $\Delta_2 x'$
two infinitesimal intervals, we have, as we easily see by using (72),

$$\int_{-\infty}^{\infty} dx
\int_{\Delta_1 x'} \delta(x - x')\, dx'
\int_{\Delta_2 x'} \delta(x - x')\, dx' =
\begin{cases}
0 & \text{if } \Delta_1 x', \Delta_2 x' \text{ have no point in common,} \\
\Delta x & \text{if } \Delta_1 x', \Delta_2 x' \text{ have a segment }
\Delta x \text{ in common.}
\end{cases} \tag{76}$$

Therefore: the eigenvalues of the operator $\mathfrak{x}$ are all the
real numbers $x'$, and to each of these there corresponds an axis
specified by the function (75). These axes are actually the
"continuous axes" introduced in §1. In fact, the projection of any
function $f(x)$ upon the axis corresponding to the eigenvalue $x'$ is

$$f_{x'} = f \cdot \delta(x - x')
= \int f(x)\, \delta(x - x')\, dx = f(x'). \tag{77}$$

From the formal standpoint, it is apparent that the symbol
$\delta_{nm}$ which is often used ($\delta_{nm} = 0$ if $n \neq m$,
$\delta_{nm} = 1$ if $n = m$) has its analogue in the function
$\delta(\lambda - \mu)$. For instance, the condition for orthogonality
and for normalization of the eigenfunctions, expressed by (46) and
(46') of Chapter 5, may be written symbolically as

$$\int y_{\lambda'}\, y^{*}_{\lambda''}\, dx
= \delta(\lambda'' - \lambda'). \tag{78}$$

Indeed, if (78) is satisfied, we have (calling $\Delta\lambda'$,
$\Delta\lambda''$ two infinitesimal sections of the continuous
eigenvalue spectrum)

$$\int dx \int_{\Delta\lambda'} y_{\lambda'}\, d\lambda'
\int_{\Delta\lambda''} y^{*}_{\lambda''}\, d\lambda''
= \int_{\Delta\lambda'} d\lambda' \int_{\Delta\lambda''} d\lambda''
\int y_{\lambda'}\, y^{*}_{\lambda''}\, dx
= \int_{\Delta\lambda'} d\lambda' \int_{\Delta\lambda''} d\lambda''\,
\delta(\lambda'' - \lambda').$$

The integral with respect to $\lambda''$ is 1 if $\lambda'$ lies within
the small interval $\Delta\lambda''$; otherwise it is zero. Therefore,
if we call $\Delta\lambda$ the segment (possibly zero) common to the
two intervals $\Delta\lambda'$, $\Delta\lambda''$, the last expression
reduces to

$$\int_{\Delta\lambda} d\lambda',$$

that is, to $\Delta\lambda$. Thus again we find the conditions for
orthogonality and normalization introduced in §10 of Part II.



CHAPTER 11 


General Theory of Quantum Mechanics 

15. The concept of "observable." We shall call observation (or
measurement) a series of physical operations whose result is
expressible by means of a number (we shall include in the operations of
measurement the eventual mathematical operations to be performed upon
the direct experimental results). An observation will be defined when
these operations are described and when the instant in which they are
performed is specified; for example, one of the experimental
arrangements of §23, Part II, with a specification of the time $t$ at
which the photograph is to be taken, constitutes an observation.

In ordinary mechanics, an observation serves to find the value
possessed by a certain "mechanical quantity" at a certain instant, that
is, a coordinate, a momentum, or any function $g(q, p)$ of the
coordinates and momenta, such as, for instance, the angular momentum or
the energy. This value, however, is thought of as existing even if we
do not perform the measurement which is designed to find it. In other
words, $g$ is thought of as a variable which takes on different
numerical values in succession during the motion of the system; these
values may be calculated a priori, even without observation, by means
of the laws of mechanics, if, for example, the initial coordinates and
velocities of the system are known.

In quantum mechanics, as has been noted in connection with the
coordinates of a particle, we must adopt a profoundly different point
of view. The numerical value of a physical quantity has no meaning
whatsoever until an observation upon it is performed, and the result of
that observation may in general only be predicted probabilistically. In
other words, it is maintained that the result of the observation may be
any one of certain numbers¹ $G_1$, $G_2$, ... with respective
probabilities² $P_1$, $P_2$, ...; the object of quantum mechanics is
just the determination of these possible results and their respective
probabilities. Examples of this method were seen in connection with the
observation of a coordinate or a momentum (§25 of Part II), or of the
energy (§29 of Part II).

¹ For simplicity of writing, the case considered here is the one with
discrete eigenvalues, but the $G$ may also constitute a continuous
system, for instance in the measurement of a coordinate.

In quantum mechanics, then, entities such as the coordinates, the
components of momentum, the energy, and so on (at a given instant), are
considered from a point of view different from that of ordinary
mechanics; that is, they are entities defined only by the type of
"observation" which corresponds to them. In order to make this
condition evident, Dirac has proposed the name observables for these
quantities. (We recall that in the concept of observation, and hence
also in the concept of observable, there is implicit also the instant
at which the measurement is performed; for example, the abscissa of a
particle at the time $t = 7$ sec is an observable,³ and the abscissa of
the same particle at the time $t = 8$ sec is another observable.)

In ordinary mechanics, the concept of observable is synonymous with
that of "numerical value of a variable." In quantum mechanics, however,
an observable does not in general have a numerical value, but there
correspond to it an infinity of numerical values (continuous or
discrete) $G_1$, $G_2$, ... with respective probabilities $P_1$,
$P_2$, .... Only in special cases may it happen that these
probabilities are all zero except one, for example, $P_1$ (which is
therefore = 1), and only then can we say that the observable $G$ has
the value $G_1$. For instance, this case is realized, for the energy,
when the system is in one of the states which we have called
"stationary" or "states with definite energy" (§27 of Part II), that
is, when $\psi$ is an eigenfunction of the Schrödinger equation.
However, when $\psi$ is a linear combination of eigenfunctions (§29 of
Part II), the energy has no definite numerical value. A direct
observation designed to determine it may yield for result any one of
the eigenvalues $E_1$, $E_2$, ... with the respective probabilities
$c_1 c_1^{*}$, $c_2 c_2^{*}$, ....

² The concept of probability must be understood in the manner explained
in the footnote on page 145.

³ Strictly speaking, we should specify the measuring arrangement; for
example, the pinhole camera described in §23 of Part II (first method)
defines an observable which we may call abscissa; it is a generally
accepted postulate that the second and third methods define the same
observable.

One more example: the coordinate $x$ of a particle (at a given time) is
in general an observable which has no numerical value, since by
observing it we may find all the values from $-\infty$ to $+\infty$
with a probability density $P(x) = \psi\psi^{*}$. The curve
representing $P(x)$ may have any form whatever, some examples of which
were seen in §36 of Part II. In particular, $P(x)$ may be zero
everywhere, except for a definite $x$, and only in this case does the
observable $x$ have a definite value.
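
The discrete case described above (a state that is a linear combination
of energy eigenfunctions with coefficients $c_i$, each eigenvalue
occurring with probability $c_i c_i^{*}$) can be sketched in a few
lines. The eigenvalues and coefficients below are invented for
illustration, and the explicit normalization step is an assumption
added so that the probabilities sum to 1.

```python
def outcome_probabilities(coefficients):
    """Return the probabilities c_i c_i^* after normalizing the state."""
    weights = [(c * c.conjugate()).real for c in coefficients]
    total = sum(weights)
    return [w / total for w in weights]

def definite_value(eigenvalues, probabilities, tol=1e-12):
    """Return the eigenvalue whose probability is 1, or None if there is none."""
    for value, p in zip(eigenvalues, probabilities):
        if abs(p - 1.0) < tol:
            return value
    return None

# Hypothetical energy eigenvalues E_1, E_2, E_3 (arbitrary units).
energies = [1.0, 4.0, 9.0]

# psi = c_1 psi_1 + c_2 psi_2: the energy has no definite value.
mixed = outcome_probabilities([1 + 0j, 1j, 0j])

# psi = psi_2 (a stationary state): the energy has the definite value E_2.
stationary = outcome_probabilities([0j, 2 + 0j, 0j])
```

Here `definite_value(energies, mixed)` is `None`, whereas
`definite_value(energies, stationary)` returns the second eigenvalue.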

16. Compatibility of observables. Let us suppose that the measurement
of an observable $A$ has yielded a certain result $A'$, and let us
measure, immediately following $A$, another observable $B$, obtaining
$B'$, for example. By this measurement we put the system in a new state
in which $B$ has the definite value $B'$, but $A$ in general no longer
will have a definite value (that is, if immediately after $B$ we
returned to an observation of $A$, we could no longer be certain of
finding $A'$). There are, however, certain cases in which this does not
hold true: those in which we may observe $B$ right after $A$, without
$A$ ceasing to have the value resulting from the previous measurement.
The two measurements are then said to be compatible. In this case, just
after the observation of $A$ has yielded the result $A'$ and the
observation of $B$ has given $B'$, the system finds itself in a state
such that both $A$ and $B$ have definite values, namely, $A'$ and $B'$
respectively.

As may be seen, the preceding definition applies only to simultaneous
observations. The question therefore arises: does it make any
difference whether we perform the observation of A first and the
observation of B immediately afterward, or the other way around? We
shall see in what follows that the laws of quantum mechanics require
the former choice, which we shall assume for now without further
remarks.

Two observables are said to be compatible if their observations are
always compatible, no matter what their results. This is one of the
most important concepts of quantum mechanics. An example of two
compatible observables is given by two coordinates of a particle (for
example, $x$ and $y$) at the same instant.⁴ On the other hand, a
coordinate $x$ and the conjugate momentum $p_x$ are not compatible, as
was seen in §22 of Part II.

Hence we shall say that several observables $A$, $B$, $C$, ... are
compatible if each of them is compatible with all the others.

⁴ At least if the processes of measurement by which these observables
are defined satisfy certain conditions (see §24).

17. Observables which are functions of other observables. Given an
observable $X$ and a function $f(X)$, the observable $G = f(X)$ is
defined by the following set of operations. First we perform the
observation $X$, and upon the result $X_r$ we perform the mathematical
operations indicated by the function $f$. By definition, the number
$G_r = f(X_r)$ is the result of the measurement of the observable $G$.
We may also say that $G$ is measured with the same instrument which
measures $X$, except for the substitution (for the scale calibrated in
terms of the numbers $X_r$) of another scale on which the numbers
$f(X_r)$ are marked in the same positions.

It is clear that if we suppose $f$ to be single-valued with a
single-valued inverse (so that distinct $X_r$ give distinct $G_r$) and
if the $G_r$ are discrete, their probabilities will be equal to those
of the corresponding quantities $X_r$. If, however, $X$ has a
continuous spectrum of values $X'$, with probability densities
$P(X')$, the values $G' = f(X')$ of the observable $G$ form a
continuous spectrum with probability density

$$Q(G') = P(X') \left| \frac{dX'}{dG'} \right|.$$
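
The rule for the transformed density can be checked on a simple case:
the probability that $G'$ falls in an interval must equal the
probability that $X'$ falls in the corresponding interval. The sketch
below assumes a monotonic $f$, so that $Q(G') = P(X') / |f'(X')|$ with
$X'$ the value corresponding to $G'$; the density $P(X') = 2X'$ on
(0, 1) and the function $f(X) = X^2$ are hypothetical choices.

```python
import math

def transformed_density(P, f_inverse, f_prime):
    """Return Q with Q(g) = P(x) / |f'(x)|, where x = f_inverse(g)."""
    def Q(g):
        x = f_inverse(g)
        return P(x) / abs(f_prime(x))
    return Q

def integrate(func, lo, hi, steps=10000):
    """Midpoint-rule integral of func over (lo, hi)."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width

def P(x):
    """Hypothetical probability density of X' on the interval (0, 1)."""
    return 2.0 * x

def f(x):
    """The function defining the observable G = f(X); monotonic on (0, 1)."""
    return x * x

Q = transformed_density(P, math.sqrt, lambda x: 2.0 * x)

prob_X = integrate(P, 0.2, 0.5)        # X' between 0.2 and 0.5
prob_G = integrate(Q, f(0.2), f(0.5))  # G' between f(0.2) and f(0.5)
```

The two probabilities agree, although the two densities have quite
different shapes (here $Q$ is constant while $P$ is not).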


It follows readily from the definition that an observable $X$ is
compatible with any $f(X)$.

We now proceed to the definition of a function of several observables
$X$, $Y$, $Z$, ... (referring to the same instant). If $X$, $Y$, $Z$
are compatible with each other, the procedure for the definition of
$F(X, Y, Z, \ldots)$ is the immediate extension of the procedure for
the case of a single variable: a simultaneous measurement of $X$, $Y$,
$Z$, ..., yielding numerical results $X_r$, $Y_r$, $Z_r$, ..., followed
by the mathematical operations expressed by the function
$F(X_r, Y_r, Z_r, \ldots)$.

It is then quite evident that the observable $F$ defined in this manner
is compatible with each of the observables $X$, $Y$, $Z$, ....

However, in the case that $X$, $Y$, $Z$, ... are not compatible with
each other, this procedure is evidently no longer applicable.
Nevertheless, given a function of several variables
$F(x, y, z, \ldots)$, we can define (at least under rather broad
conditions) an observable $F(X, Y, Z, \ldots)$ in the following manner.

Let us first of all consider the case of the sum of several observables
$X$, $Y$, $Z$, ... which are not compatible. Given an observable $G$
(defined by any procedure) we shall give the following criterion for
deciding whether or not $G$ is equal to the sum $X + Y + Z + \cdots$.
We examine a large number of copies of the system under consideration
(all equally constituted but possibly in different states); then we
select a very large number of these at random and perform a measurement
of $X$ upon each system of this group. Let $\bar{X}$ be the average of
the results obtained. We then select another very large group at
random, in which we measure $Y$ and compute $\bar{Y}$, and so forth;
finally we take one last, large group in which we determine $G$ and
calculate $\bar{G}$. If it now happens (for any initial state of the
ensemble of systems) that

$$\bar{G} = \bar{X} + \bar{Y} + \bar{Z} + \cdots,$$

we shall say that $G$ is the sum of $X$, $Y$, $Z$, ... and we shall
write⁵

$$G = X + Y + Z + \cdots.$$

It is apparent that this definition, when $X$, $Y$, $Z$, ... are
compatible with each other, does not conflict with the definition given
above for the function of several compatible observables. In fact, this
definition evidently preserves all the ordinary properties of this
operation.

⁵ This definition does not permit us to construct the procedure of
measurement for $G$ if we know those of $X$, $Y$, $Z$, .... Hence there
remains the question whether there exists in every case an observable
$G$ which has the indicated property. However, in the cases which occur
in practice, this condition may actually be verified, and we shall
postulate that it is always verifiable. For example, the kinetic energy
of an electron is defined by $X = (1/2m)(p_x^2 + p_y^2 + p_z^2)$ and
hence is obtainable from a measurement of momentum; the potential
energy is defined by $Y = eV(x, y, z)$ and can be obtained from a
measurement of position. They are incompatible observables, but the
total energy $E$ (definable in reference to spectroscopic terms) has
the property $\bar{E} = \bar{X} + \bar{Y}$, and therefore we may write
$E = X + Y$.

From the definition of the sum we may pass on to the definition of
"symmetrized product," that is, of $\tfrac{1}{2}(XY + YX)$. Indeed,
supposing that there exists an observable $G$ such that $G = X + Y$,
the formal properties of algebra are preserved if we can write
$G^2 = X^2 + Y^2 + XY + YX$. Since $X^2$, $Y^2$, $G^2$ represent
observables which have already been defined, this reasoning leads us to
introduce the following definition for the symmetrized product:

$$\tfrac{1}{2}(XY + YX) = \tfrac{1}{2}(G^2 - X^2 - Y^2);$$

that is, we shall say that an observable $F$ is equal to
$\tfrac{1}{2}(XY + YX)$ if it is equal (in the sense specified above)
to $\tfrac{1}{2}(G^2 - X^2 - Y^2)$.

If the observables $X$ and $Y$ are compatible, their symmetrized
product is identified with the product $XY$ or $YX$. However, if they
are incompatible, no meaning can be given to the expressions $XY$ or
$YX$, but only to $\tfrac{1}{2}(XY + YX)$. By an analogous procedure,
the symmetrized products for any desired number of factors may be
defined.
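
The algebraic identity behind this definition,
$\tfrac{1}{2}(XY + YX) = \tfrac{1}{2}[(X + Y)^2 - X^2 - Y^2]$, holds for
any quantities whose product depends on the order of the factors. The
sketch below checks it with two small matrices that do not commute.
Representing observables by matrices is an assumption made here purely
for illustration (the text defines them operationally), and the two
sample matrices are arbitrary.

```python
def matmul(P, Q):
    """Row-by-column product of two square matrices (nested lists)."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def sub(P, Q):
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(c, P):
    return [[c * p for p in row] for row in P]

# Two hypothetical symmetric matrices standing in for incompatible X and Y.
X = [[0, 1], [1, 0]]
Y = [[2, 1], [1, 0]]
G = add(X, Y)  # the sum G = X + Y

# Left side: half the sum of the two ordered products.
symmetrized = scale(0.5, add(matmul(X, Y), matmul(Y, X)))
# Right side: built only from the squares of G, X and Y.
from_squares = scale(0.5, sub(sub(matmul(G, G), matmul(X, X)), matmul(Y, Y)))
```

Both expressions give the same matrix, while `matmul(X, Y)` and
`matmul(Y, X)` differ from each other.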

Now given a function of several variables $F(x, y, z, \ldots)$
developable in a power series, we may always write each term of the
series in symmetrized form and therefore may attach a meaning to the
form which is obtained by replacing $x$, $y$, $z$, and so on, by as
many observables (even if not compatible) $X$, $Y$, $Z$, .... In this
way the meaning of $F(X, Y, Z, \ldots)$ is defined.

18. Maximum observations. State of a system. In classical mechanics,
when the positions and velocities of all the points of a system at a
given instant have been assigned, its "state" at the instant considered
has been completely defined, meaning that any further condition (for
example, the assignment of a given value to the energy) would be either
incompatible with the preceding ones or else automatically satisfied.

We shall now see how this notion may be carried over into quantum
mechanics. If, for example, we assign values to the coordinates of a
system of particles at a given instant, we have already reached a
complete description of the state of the system, in the sense explained
above. In fact, any further condition would be either automatically
satisfied or incompatible with the others. (Thus, if we were also to
assign the value of a component of velocity, this condition would be
incompatible with the preceding ones by virtue of the uncertainty
principle.) Hence we shall say that a measurement of all the
coordinates of the particles constitutes a "complete set of
observations," or a "maximum observation," because it furnishes, so to
speak, a maximum amount of information obtainable from measurements
performed on the system. Instead of the coordinates, we could measure
all the components of momentum, obtaining in this way another complete
set of observations. In general, we shall say that a set of
observations is complete, or that it constitutes a maximum observation,
if there exists no other observation which is independent and
compatible with the set considered.

The results of a maximum observation define completely the state of the
system (which is supposed to be isolated). Thus the quantum mechanical
notion of "state" is precisely described. It is to be noted that the
state defined in this manner does not refer to a particular instant but
rather to the entire time interval in which the system remains
isolated. During that time, of course, the system evolves according to
a well-determined process characteristic of the state in question.
Thus, for example, if the three coordinates of a particle have been
measured at the time $t = 0$, the state of the particle for all later
times (until it interacts with other systems, possibly with a radiation
field) remains the same, and is defined by the fact that for $t = 0$
the coordinates had those given values. (Naturally, this does not mean
that the coordinates retain these values; rather, they suddenly cease
to have definite values.) To each state, there corresponds a certain
function $\psi$ which characterizes it, as we shall see later when we
generalize what has been said for a single particle.

If one of the observations serving to define the state is a measurement
of energy, we are dealing with one of the states which we have called
(§27 of Part II) stationary states⁶ or quantum states, to which there
correspond the eigenfunctions of the Schrödinger equation.

All that has been said so far applies to isolated systems. In 
many cases, however, the notion of state may be extended also to 
systems under external influences, where the state of the system at 
a given instant t is defined as the state in which the system would 
be left if the external action were to cease at the time t. It is to 
be understood that in general the state of a system which is not 
isolated will vary in time. 

Returning now to the case of an isolated system, we note that in order
to perform any observation upon it, we must necessarily make it
interact with another system (that is, with the measuring apparatus)
and hence we will interrupt its isolation at least momentarily.
Therefore, an observation performed upon a system generally alters its
state. However, there are cases in which there is no alteration; for
instance, if the system is in a stationary state we may measure its
energy (again finding the initial value, of course) without altering
its state.

⁶ This designation, the reason for which we shall see in §24, is not to
suggest that these states are the only ones which do not vary in time.
For instance, if we superimpose two stationary states by taking for
$\psi$ a linear combination of two Schrödinger eigenfunctions (see §29
of Part II), we get a nonstationary state which is nevertheless
invariant in time.

Let us compare the quantum-mechanical notions of "maximum observation"
and "state" with the corresponding classical concepts. We observe first
of all that in classical mechanics, when we know the state of the
system at time $t = 0$ (in the sense specified at the beginning of this
section), we can calculate the value of any mechanical quantity
relative to the system. Hence, for example, a knowledge of the
coordinates and velocities at time $t = 0$ is equivalent to a knowledge
of all mechanical variables at the time $t = 0$ (and also at any other
instant). In quantum mechanics, on the other hand, when a maximum
observation has been performed, generally we may not calculate the
result of a further observation (even if it is performed at the same
instant) but may only assign the possible results and their relative
probabilities (by a method whose general form will be explained in this
chapter). Therefore a complete knowledge of the "state" of the system
does not imply any knowledge of the aggregate of all quantities
relative to the system (or else we may not attribute to this aggregate
any physical significance). In general there are only a few observables
which have definite values, whereas only probabilistic indications may
be given concerning the others.

Another essential difference between classical and quantum 
mechanics resides in the following. If in classical mechanics we 
were to substitute for the group of values of the coordinates and 
velocities another group of as many of their independent functions, 
we should obtain a representation of the state which would be 
equivalent to the other in all respects. In quantum mechanics, 
however, there are an infinite number of nonequivalent ways of 
giving a complete description of the state of the system, depending 
on which complete set of observations is selected. For example, to 
assign the values of the coordinates or to assign the values of the 
momenta are two equally complete ways of defining the state of 
the system. They are not equivalent, however, since the observables
having a definite value in the first case do not have a definite
value in the second case, and vice versa. In this connection, see
also §32.

19. Interpretation of the Schrödinger method in Hilbert space.

We shall now briefly recapitulate the procedure of the wave mechanics
of Schrödinger, stating it in the geometrical terminology of
Hilbert space. In this manner the route to many important
generalizations will be perceived.

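We start out by noticing a formal analogy between the Schrödinger
equation for stationary states, which we write in the form

$$-\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) + U\psi = E\psi, \tag{79}$$

and the kinetic energy theorem of classical mechanics, which we
write

$$\frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + U = E. \tag{80}$$

The analogy consists in the following: if in (80) we replace bodily
the variables

$$p_x,\quad p_y,\quad p_z,$$

respectively by the (Hermitian) operators

$$\frac{h}{2\pi i}\frac{\partial}{\partial x},\quad \frac{h}{2\pi i}\frac{\partial}{\partial y},\quad \frac{h}{2\pi i}\frac{\partial}{\partial z}, \tag{S}$$

the left side is changed into the operator which is applied to ψ in
the left-hand member of (79).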

Recalling from the footnote on page 234 that the expression
for the energy as a function of the q and the p was generally
indicated by $\mathcal{H}(q, p)$ and called the Hamiltonian, we shall say that the
Schrödinger equation is obtained by transforming the Hamiltonian
$\mathcal{H}$ into an operator $\mathfrak{H}$ (which we shall call the Hamiltonian operator) by
means of the substitution (S) and writing

$$\mathfrak{H}\psi_n = E_n\psi_n; \tag{81}$$

that is (see §10), we are trying to find the eigenvalues and
eigenfunctions of this operator (which turns out to be Hermitian). Its
eigenvalues $E_n$ represent the possible values of the energy, and its
eigenfunctions $\psi_n$, which may be interpreted as orthogonal unit
vectors directed along the principal axes of the operator $\mathfrak{H}$, have
the following physical interpretation. When the system is in the
nth stationary state and we perform an observation of the
coordinates, the probability density of finding the values x, y, z is given
by $|\psi_n(x, y, z)|^2$; in geometrical terms, by the square of the modulus
of the projection of the vector* $\psi_n$ upon the axis (of the system of
continuous axes; see §2) specified by the values x, y, z. For brevity
we shall say that the “probability amplitude” for the values x, y, z
is equal to this projection.

When the system is not in a stationary state, that is, when ψ has
the general form $\psi = \sum_n c_n\psi_n$, or when the vector ψ is not directed
along one of the principal axes of the operator $\mathfrak{H}$, the same meaning
still holds for the components of that vector along the continuous
axes. Furthermore, if we perform a measurement of energy, the
probability of finding the value $E_n$ is $|c_n|^2$ (see §29 of Part II); that
is, the amplitude of this probability is $c_n$, which is the projection
of the vector ψ upon the nth principal axis of the operator $\mathfrak{H}$.

Hence the vector ψ determines the probability of the results of
any measurement of the coordinates or of the energy (and also,
as we shall see later, of any other observation carried out on the
system). The vector ψ is generally a function of time, and its
evolution in time is governed by the time-dependent Schrödinger
equation (see §30 of Part II), which may now be written in the form

$$\mathfrak{H}\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}. \tag{82}$$

This equation may be considered as obtained from the classical
equation for the conservation of energy ($\mathcal{H} = E$), where we perform,
in addition to the substitution (S), the substitution of

$$E \quad\text{by}\quad -\frac{h}{2\pi i}\frac{\partial}{\partial t} \tag{S'}$$

and operate upon ψ with the operators thus obtained.

The vector ψ, considered as a function of time, characterizes
the state of the system. It will henceforth be called the state vector.

* For simplicity, we shall use the same letter to indicate a function ψ and
the corresponding vector in Hilbert space (rather than using a boldface symbol
for the latter, as in the preceding chapter).

20. Extension to a system of N distinct particles. The preceding
section's formulation of the Schrödinger problem for a single
particle suggests, by an obvious generalization, a method for
treating the more general problem of any number of distinct† particles.

Let N be the number of particles; their 3N = f coordinates will
sometimes be indicated by $x_1, y_1, z_1, x_2, y_2, z_2, \ldots, x_N, y_N, z_N$, and
sometimes, when it is convenient, by $q_1, q_2, \ldots, q_f$. We must now
introduce (generalizing the criterion for the procedure followed in
§25 of Part II) the probability that a simultaneous observation of
all the particles at the time t yields values for the coordinates lying
between $q_1, q_2, \ldots, q_f$ and $q_1 + dq_1, q_2 + dq_2, \ldots, q_f + dq_f$,
respectively. This probability will be denoted by

$$P(q_1, q_2, \ldots, q_f, t)\,dq_1\,dq_2 \cdots dq_f.$$

If, on the other hand, we carry out an observation on the kth
particle only, without considering the others, the probability of
finding it in the element of volume defined by $x_k, y_k, z_k$, $x_k + dx_k$,
$y_k + dy_k$, $z_k + dz_k$ will be

$$P_k(x_k, y_k, z_k, t)\,dx_k\,dy_k\,dz_k,$$

where $P_k$ will be obtained from P by integrating the latter with
respect to all coordinates except $x_k, y_k, z_k$, and over all values which
these coordinates may assume.

It is evident that the N functions $P_k$ are defined once P is known, but
in general P is not determined from a knowledge of the $P_k$. This
statement may be understood, apart from analytical means, by the following
intuitive consideration. Let there be a box with two equal
compartments A and B, into which two small balls a and b are thrown at random,
the balls being independent of each other. If we observe only one of these,
the probability of finding it in either A or B is 1/2. If we observe both of
them, the following four cases may occur:

    a in A, b in B        probability 1/4
    a in A, b in A        probability 1/4
    a in B, b in B        probability 1/4
    a in B, b in A        probability 1/4

Let us now consider that the two balls are so large that only one can enter
each compartment. Then there is still a probability of 1/2 for each to be
found in either A or B; but for both of them taken together, the
probabilities of the preceding table become, respectively, 1/2, 0, 0, 1/2. This
result shows that the knowledge of the relative probabilities of the separate
observations of each particle is not sufficient to determine the relative
probabilities of the observations of both of them together. In fact, it is
also necessary to know how the position of one particle influences the
probability of the position of the other. If this influence is zero (as in the
first case), we say that the two balls are “statistically independent.”

† Here “distinct” is to mean that we suppose each particle to have its
own individuality; that is, we may distinguish it from the others. If we were
dealing with identical particles (electrons, for example), other arguments would
have to be made; these will be postponed until Chapter 15.
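
The two-ball example can be checked with a few lines of Python. This is only an illustrative sketch: the names `marginals` and `is_independent` and the dictionary layout are assumptions made here, not notation from the text. It encodes both joint distributions and confirms that they share the same single-ball probabilities, while only the first factors into a product.

```python
from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)

# Joint probabilities P(a, b): compartment of ball a, compartment of ball b.
# Small balls: each ball lands independently of the other.
small = {("A", "A"): quarter, ("A", "B"): quarter,
         ("B", "A"): quarter, ("B", "B"): quarter}
# Large balls: only one ball fits in each compartment.
large = {("A", "A"): Fraction(0), ("A", "B"): half,
         ("B", "A"): half, ("B", "B"): Fraction(0)}


def marginals(joint):
    """Return (P_a, P_b): the single-ball distributions obtained by summing
    the joint distribution over the other ball."""
    p_a = {c: sum(p for (a, _), p in joint.items() if a == c) for c in "AB"}
    p_b = {c: sum(p for (_, b), p in joint.items() if b == c) for c in "AB"}
    return p_a, p_b


def is_independent(joint):
    """True when the joint distribution equals the product of its marginals."""
    p_a, p_b = marginals(joint)
    return all(joint[(a, b)] == p_a[a] * p_b[b] for a in "AB" for b in "AB")


print(marginals(small) == marginals(large))  # same single-ball probabilities
print(is_independent(small), is_independent(large))
```

Exact fractions are used so that the comparison with the product of the marginals is not affected by rounding.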

Returning to the case of N elementary particles, we shall say that the
latter are statistically independent if $P(q_1, q_2, \ldots, q_f, t)$ has the following
form:*

$$P = P_1(x_1, y_1, z_1, t)\,P_2(x_2, y_2, z_2, t) \cdots P_N(x_N, y_N, z_N, t). \tag{83}$$

This is the only case in which P is determined from a knowledge of the $P_k$.
It is to be noted that the particles need not be statistically independent
even if no forces exist between them. This is a very important point, to
which we shall return later on.

* It may be verified immediately that by integrating P with respect to
all variables, except $x_k, y_k, z_k$, over all their ranges of variation, we obtain
$P_k(x_k, y_k, z_k, t)$.

In analogy to what was done for a single particle, we introduce
a (complex) function

$$\psi(q_1, q_2, \ldots, q_f, t)$$

such that

$$P(q_1, q_2, \ldots, q_f, t) = |\psi|^2$$

and impose upon ψ the condition to satisfy an equation which is
to be the generalization of the Schrödinger equation for a single
particle. This generalization will be based on the analogy
mentioned in §19. For this purpose we start from the classical
expression for the Hamiltonian of a system of N particles in Cartesian
coordinates. Let us suppose that forces† act upon them which are
derivable from a potential $U(q_1, q_2, \ldots, q_f)$, and let us indicate by
$p_x^{(k)}, p_y^{(k)}, p_z^{(k)}$ the momenta conjugate to $x_k, y_k, z_k$ (or by $p_j$ the
momentum conjugate to $q_j$). The mass of the kth particle will be
denoted by $m^{(k)}$, and its charge by $e^{(k)}$. The Hamiltonian of the
system may then be written

$$\mathcal{H} = \sum_{k=1}^{N}\frac{1}{2m^{(k)}}\left(p_x^{(k)2} + p_y^{(k)2} + p_z^{(k)2}\right) + U(q_1, q_2, \ldots, q_f),$$

or else, in a more convenient notation,

$$\mathcal{H} = \sum_{j=1}^{f}\frac{1}{2m_j}\,p_j^2 + U(q_1, q_2, \ldots, q_f), \tag{84}$$

where $m_1 = m_2 = m_3$ stand for $m^{(1)}$, $m_4 = m_5 = m_6$ stand for $m^{(2)}$,
and so on.

† We neglect the magnetic interactions between the particles of the system,
which are intimately connected with the relativistic corrections which will be
introduced in Chapter 14.

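Let us transform this expression into an operator, as was done
in the case of a single particle, by replacing

$$p_j \quad\text{by}\quad \mathfrak{p}_j = \frac{h}{2\pi i}\frac{\partial}{\partial q_j},$$

whereupon we have

$$\mathfrak{H} = -\frac{h^2}{8\pi^2}\sum_{j=1}^{f}\frac{1}{m_j}\frac{\partial^2}{\partial q_j^2} + U(q_1, q_2, \ldots, q_f), \tag{85}$$

or else, indicating by

$$\nabla_k^2 = \frac{\partial^2}{\partial x_k^2} + \frac{\partial^2}{\partial y_k^2} + \frac{\partial^2}{\partial z_k^2}$$

the Laplacian operator for the kth particle,

$$\mathfrak{H} = -\frac{h^2}{8\pi^2}\sum_{k=1}^{N}\frac{1}{m^{(k)}}\nabla_k^2 + U(q_1, q_2, \ldots, q_f). \tag{86}$$

Let us now assume that the ψ of the system satisfies the following
equation, which is a generalization of the time-dependent
Schrödinger equation [see (136) of Part II]:

$$\mathfrak{H}\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}, \tag{87}$$

or

$$\sum_{k=1}^{N}\frac{1}{m^{(k)}}\nabla_k^2\psi - \frac{8\pi^2}{h^2}U\psi = -\frac{4\pi i}{h}\frac{\partial\psi}{\partial t}. \tag{87'}$$

The stationary states or states with definite energy will be
those for which ψ is an eigenfunction of the operator $\mathfrak{H}$, that is,
such that

$$\mathfrak{H}\psi_n = E_n\psi_n. \tag{88}$$

Here, of course, the index n stands for a group of f indices.

The time dependence of these $\psi_n$ is obtained by comparing (88)
with (87), which yields

$$\frac{\partial\psi_n}{\partial t} = -\frac{2\pi i}{h}E_n\psi_n$$

and hence

$$\psi_n = u_n(q_j)\,e^{-(2\pi i/h)E_n t}, \tag{89}$$

where $u_n(q_j)$ satisfies the equation

$$\mathfrak{H}u_n = E_n u_n, \tag{88'}$$

as in the case of a single particle.

The most general ψ may of course be developed in a series of
the $\psi_n$ of (89); that is, any state of the system may be considered
as a superposition of stationary states.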

It is interesting to verify that equation (87), which was adopted
by induction, is satisfied, in particular, in the case of N dynamically
independent particles, that is, those which do not exert forces on each
other but are subject only to external forces. In this case, U evidently
breaks up into a sum of N functions $U_k$, each of which contains
only the coordinates of one particle, and hence the operator (86)
may also be split into a sum of N operators $\mathfrak{H}_k$, each operating on
the coordinates of only one particle:

$$\mathfrak{H}_k = -\frac{h^2}{8\pi^2 m^{(k)}}\nabla_k^2 + U_k.$$

Hence (87) becomes

$$\sum_{k=1}^{N}\mathfrak{H}_k\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}. \tag{87''}$$

Now we can see immediately that this equation can be satisfied by
taking

$$\psi = \psi^{(1)}\psi^{(2)} \cdots \psi^{(N)}, \tag{90}$$

where $\psi^{(k)}$ is a function of $x_k, y_k, z_k, t$ only and satisfies the equation

$$\mathfrak{H}_k\psi^{(k)} = -\frac{h}{2\pi i}\frac{\partial\psi^{(k)}}{\partial t},$$

which is the ordinary Schrödinger equation referring to the kth
particle. For the ψ of the system, then, we may take the product
of the ψ of the separate particles. In this case, (83) is evidently
satisfied; that is, the particles are statistically independent. If each
of the particles is in a stationary state (with index $n_k$), that is, if

$$\mathfrak{H}_k\psi^{(k)}_{n_k} = E^{(k)}_{n_k}\psi^{(k)}_{n_k},$$

we may verify immediately that $\psi_{n_1 n_2 \ldots n_N}$, which is the product of all the
$\psi^{(k)}_{n_k}$, satisfies the equation

$$\mathfrak{H}\psi_{n_1 n_2 \ldots n_N} = E_{n_1 n_2 \ldots n_N}\psi_{n_1 n_2 \ldots n_N}$$

with

$$E_{n_1 n_2 \ldots n_N} = \sum_{k=1}^{N}E^{(k)}_{n_k};$$

this amounts to saying that the system is in a stationary state,
and that its energy is the sum of the energies of the individual
particles.

Hence the splitting of the Hamiltonian into a sum of N terms,
each of which depends on only the coordinates of a single particle,
entails the possibility of splitting ψ into the product of N factors,
each corresponding to one particle, and of breaking up every
eigenvalue into the sum of N terms, each representing the energy of one
particle.
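
The verification referred to above can be written out in a few lines. The following is a sketch for N = 2 only; the symbols $\mathfrak{H}_1$, $\mathfrak{H}_2$ for the one-particle operators and the sign convention $\mathfrak{H}\psi = -(h/2\pi i)\,\partial\psi/\partial t$ are the notation assumed here.

```latex
% Each one-particle operator acts only on its own coordinates, so applying
% the sum to the product produces no cross terms:
\begin{aligned}
(\mathfrak{H}_1 + \mathfrak{H}_2)\,\psi^{(1)}\psi^{(2)}
  &= \psi^{(2)}\,\mathfrak{H}_1\psi^{(1)} + \psi^{(1)}\,\mathfrak{H}_2\psi^{(2)} \\
  &= -\frac{h}{2\pi i}\left(\psi^{(2)}\frac{\partial\psi^{(1)}}{\partial t}
       + \psi^{(1)}\frac{\partial\psi^{(2)}}{\partial t}\right)
   = -\frac{h}{2\pi i}\,\frac{\partial}{\partial t}\bigl(\psi^{(1)}\psi^{(2)}\bigr).
\end{aligned}

% For stationary factors the same step gives the additivity of the energies:
(\mathfrak{H}_1 + \mathfrak{H}_2)\,\psi^{(1)}_{n_1}\psi^{(2)}_{n_2}
  = \bigl(E^{(1)}_{n_1} + E^{(2)}_{n_2}\bigr)\,\psi^{(1)}_{n_1}\psi^{(2)}_{n_2}.
```

The general case of N factors follows by repeating the same product-rule step for each particle.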

But it is to be noted that (87'') (established under the single
hypothesis that the particles are dynamically independent) admits,
in addition to solutions of the type (90), any linear combination of
such solutions. These solutions represent the cases in which the
particles are not statistically independent, although they are
dynamically independent.

It is easily recognized that if the particles are statistically
independent at a given instant $t_0$, they are also statistically independent
at any other instant (provided, of course, that no forces act between
them). In fact, because of (87''), ψ is uniquely determined by its
values for $t = t_0$ in all space; hence if solution (90) holds for $t = t_0$,
it holds also for any other time t.

It is to be noted that whereas in the case of a single particle ψ
represents fictitious waves in ordinary space, usually in the case
of N particles we cannot retain this picture, since ψ contains 3N
coordinates (in addition to t). Therefore ψ may be interpreted
only by means of waves in a 3N-dimensional space.

21. The two-body problem. As an application, let us consider
the problem which arises in the treatment of hydrogenlike systems
when we wish to take into account the fact that the nucleus is not
fixed, as was assumed in §48 of Part II. This problem, which has
been treated in §58 of Part II from the viewpoint of the
Bohr-Sommerfeld theory, corresponds to the well-known two-body
problem in classical mechanics.
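Let $x_1, y_1, z_1$ be the coordinates of the nucleus and $x_2, y_2, z_2$ the
coordinates of the electron (with respect to any fixed axes), and
let $p_{1x}, p_{1y}, p_{1z}, p_{2x}, p_{2y}, p_{2z}$ be the respective momenta conjugate
to these coordinates. Let M be the mass of the nucleus and m the
mass of the electron. The classical Hamiltonian is

$$\mathcal{H} = \frac{1}{2M}\left(p_{1x}^2 + p_{1y}^2 + p_{1z}^2\right) + \frac{1}{2m}\left(p_{2x}^2 + p_{2y}^2 + p_{2z}^2\right) + U, \tag{91}$$

where U is the potential energy, which, since it depends only on
the relative position, will be a function of $(x_2 - x_1)$, $(y_2 - y_1)$,
$(z_2 - z_1)$.

Therefore $\psi(x_1, y_1, z_1, x_2, y_2, z_2, t)$ must satisfy the equation

$$\mathfrak{H}\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} \tag{92}$$

with

$$\mathfrak{H} = -\frac{h^2}{8\pi^2 M}\left(\frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial y_1^2} + \frac{\partial^2}{\partial z_1^2}\right) - \frac{h^2}{8\pi^2 m}\left(\frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial y_2^2} + \frac{\partial^2}{\partial z_2^2}\right) + U(x_2 - x_1,\, y_2 - y_1,\, z_2 - z_1). \tag{93}$$

Let us express this operator, not in terms of the variables $x_1, y_1,
z_1, x_2, y_2, z_2$, but by using the three center-of-mass coordinates

$$\xi = \frac{Mx_1 + mx_2}{M + m},\quad \eta = \frac{My_1 + my_2}{M + m},\quad \zeta = \frac{Mz_1 + mz_2}{M + m},$$

and the three coordinates of the electron with respect to the nucleus

$$x = x_2 - x_1,\quad y = y_2 - y_1,\quad z = z_2 - z_1.$$

Evidently we have

$$\frac{\partial}{\partial x_1} = \frac{\partial\xi}{\partial x_1}\frac{\partial}{\partial\xi} + \frac{\partial x}{\partial x_1}\frac{\partial}{\partial x} = \frac{M}{M + m}\frac{\partial}{\partial\xi} - \frac{\partial}{\partial x},$$

and hence

$$\frac{\partial^2}{\partial x_1^2} = \left(\frac{\partial}{\partial x_1}\right)^2 = \left(\frac{M}{M + m}\right)^2\frac{\partial^2}{\partial\xi^2} + \frac{\partial^2}{\partial x^2} - \frac{2M}{M + m}\frac{\partial^2}{\partial\xi\,\partial x}.$$

Similarly, we find

$$\frac{\partial^2}{\partial x_2^2} = \left(\frac{\partial}{\partial x_2}\right)^2 = \left(\frac{m}{M + m}\right)^2\frac{\partial^2}{\partial\xi^2} + \frac{\partial^2}{\partial x^2} + \frac{2m}{M + m}\frac{\partial^2}{\partial\xi\,\partial x}.$$

By substituting into (93), we see that the terms with mixed
derivatives cancel, and similarly for the other coordinates, so that we
are left with

$$\mathfrak{H} = -\frac{h^2}{8\pi^2(M + m)}\left(\frac{\partial^2}{\partial\xi^2} + \frac{\partial^2}{\partial\eta^2} + \frac{\partial^2}{\partial\zeta^2}\right) - \frac{h^2}{8\pi^2}\left(\frac{1}{m} + \frac{1}{M}\right)\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + U(x, y, z).$$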


Since U does not contain ξ, η, ζ, the operator $\mathfrak{H}$ may be split into a
sum of two parts, one of which, $\mathfrak{H}_0$, contains only ξ, η, ζ, the other
of which, $\mathfrak{H}'$, contains only x, y, z:

$$\mathfrak{H}_0 = -\frac{h^2}{8\pi^2(M + m)}\left(\frac{\partial^2}{\partial\xi^2} + \frac{\partial^2}{\partial\eta^2} + \frac{\partial^2}{\partial\zeta^2}\right),$$

$$\mathfrak{H}' = -\frac{h^2}{8\pi^2 m'}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + U(x, y, z),$$

where we have put

$$\frac{1}{m'} = \frac{1}{m} + \frac{1}{M},\quad\text{that is,}\quad m' = \frac{mM}{M + m} = \frac{m}{1 + (m/M)}.$$

(m' is called the reduced mass.) Correspondingly, ψ may be split
(see §20) into the product

$$\psi = \chi(\xi, \eta, \zeta)\,\Psi(x, y, z),$$

where the two functions χ and Ψ satisfy the equations

$$\mathfrak{H}_0\chi = -\frac{h}{2\pi i}\frac{\partial\chi}{\partial t}, \tag{94}$$

$$\mathfrak{H}'\Psi = -\frac{h}{2\pi i}\frac{\partial\Psi}{\partial t}. \tag{95}$$

As may be seen, the operator $\mathfrak{H}'$ is identical with the operator which
occurs in the problem of the motion of an electron about a
supposedly fixed nucleus, except for the substitution, for the mass m, of
the (slightly smaller) mass m'. Hence Ψ will be identical with the ψ
of the theory developed in §46 of Part II, provided that we replace
the mass m by the reduced mass m'. (This is the same modification
which must be made in classical mechanics or in the
Bohr-Sommerfeld theory in order to account for the motion of the nucleus; see
§58 of Part II.)

As far as the operator $\mathfrak{H}_0$ is concerned, it is the one which arises
in the motion of a particle of mass M + m not subject to forces.
χ is therefore an eigenfunction of the free point particle (see §44 of
Part II). This means that the motion of the center of mass may
be treated, in wave mechanics as well, as the motion of a mass point
of mass equal to the total mass of the system.*

As for the energy levels, they turn out to be (see the preceding
section) the sum of an eigenvalue of (94) and an eigenvalue of (95).
This result signifies that the energy of the atom in a stationary
state may be divided into an energy of translation and an internal
energy, just as in ordinary mechanics. For the latter energy (which
alone is of interest in spectroscopy), then, the eigenvalues found
in §48 of Part II must hold, with the slight correction introduced
by replacing the mass m by the reduced mass m'.
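
To give a sense of how slight this correction is, the reduced mass can be evaluated numerically. The sketch below is illustrative only: the function name `reduced_mass` is an assumption made here, and the proton-to-electron mass ratio is a rounded value supplied for the example rather than taken from the text. It also uses the fact that the hydrogenlike Coulomb levels are proportional to the mass, so that they are rescaled by m'/m.

```python
# Rounded proton-to-electron mass ratio, assumed here for illustration.
PROTON_TO_ELECTRON_MASS = 1836.15


def reduced_mass(m, M):
    """Reduced mass m' = m*M / (M + m) of two bodies with masses m and M."""
    return m * M / (M + m)


m = 1.0                              # electron mass, used as the unit
M = PROTON_TO_ELECTRON_MASS * m      # nuclear mass in the same unit
m_prime = reduced_mass(m, M)

# The Coulomb levels of the fixed-nucleus problem are proportional to the
# mass, so the moving nucleus rescales each internal level by
# m'/m = 1 / (1 + m/M).
scale = m_prime / m
print(f"m'/m = {scale:.6f}")
print(f"relative shift of the levels = {1 - scale:.3e}")
```

For heavier nuclei the ratio m/M is smaller still, so m' approaches m and the fixed-nucleus levels are recovered.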

* This result makes it legitimate to apply the Schrödinger equation to the
over-all motion of a complex system such as an atom or a molecule. The
experiments on the diffraction of H, H₂, and He confirm this result (see §29
of Part I).

22. Fundamental principle of quantum mechanics.
Correspondence between observables and operators. In the procedure
outlined in §§19 and 20, the observable “energy” occupies a
privileged position. We may try to generalize these considerations by
assigning to some other general observable G the role held by the
energy up until now. The generalization will be made by analogy
with the previous case, and its validity will be verified by
comparing the consequences deduced from it with experiment.

For this purpose we shall assume the postulate that to each
observable G we may make correspond a linear Hermitian operator
$\mathfrak{G}$ which has the following properties: its eigenvalues $G_r$ represent†
the possible results of a measurement of G, and its principal axes,
specified by the eigenfunctions $\varphi_r$ of the equation

$$\mathfrak{G}\varphi_r = G_r\varphi_r, \tag{96}$$

have the property that the projection of the state vector ψ upon the rth
principal axis yields (supposing that $G_r$ is not a multiple eigenvalue)
the probability amplitude of obtaining the result $G_r$ in a measurement
of G. This is equivalent to saying that if we call this component
$\psi_{G_r}$, that is, if we set

$$\psi_{G_r} = \psi\cdot\varphi_r = \int\psi\,\varphi_r^{*}\,dS$$

(where the $\varphi_r$ are assumed to be normalized), the probability of
the value $G_r$ is $|\psi_{G_r}|^2$. (Note that since the vector ψ is of unit
magnitude, we have $\sum_r|\psi_{G_r}|^2 = 1$, as it must, in order that the
above-mentioned significance of probability may be attributed to
the $|\psi_{G_r}|^2$.) However, if the operator has a continuous eigenvalue
spectrum, we shall indicate a general one of these continuous
eigenvalues by G' rather than by $G_r$, and the corresponding eigenfunction
by $\varphi_{G'}$ (normalized according to the criterion of §10 of Part II).
Then $|\psi_{G'}|^2\,dG'$, where $\psi_{G'}$ is the projection of ψ upon $\varphi_{G'}$, will be
the probability that an observation of G will
give a result lying between G' and G' + dG'. In general, for the
sake of convenience of writing, we shall refer here to the case of
discrete eigenvalues, it being understood that obvious modifications
are to be made in the formulas for the case of continuous eigenvalues.

† These eigenvalues always turn out real, since the operator is Hermitian.

If, in particular, the state vector lies along one of the principal
axes of $\mathfrak{G}$ (that is, if $\psi = \varphi_r$), the system is in a state such that a
measurement of G yields the value $G_r$ with certainty. This happens
for the energy, when the state is a stationary state.
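
The postulate can be illustrated in the simplest possible setting, a space of two dimensions. The sketch below is a toy example under stated assumptions: the 2×2 operator `G`, its eigenvectors, the state `psi`, and the helper names `inner` and `apply` are all chosen here for illustration and do not come from the text. It checks the eigenvalue equation for each principal axis and then computes each probability as the squared modulus of the projection of the state vector.

```python
import math


def inner(phi, psi):
    """Projection of psi upon the axis phi: sum over k of conj(phi_k) * psi_k."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


def apply(op, vec):
    """Apply a matrix (given as a list of rows) to a vector."""
    return [sum(row[k] * vec[k] for k in range(len(vec))) for row in op]


# A Hermitian operator on a two-dimensional space, chosen only for illustration.
G = [[0, 1],
     [1, 0]]
s = 1 / math.sqrt(2)
# Its eigenvalues G_r and normalized eigenvectors phi_r (the principal axes).
eigen = {+1: [s, s], -1: [s, -s]}

# Check the eigenvalue equation G phi_r = G_r phi_r for each axis.
for value, phi in eigen.items():
    assert all(abs(x - value * y) < 1e-12 for x, y in zip(apply(G, phi), phi))

psi = [0.6, 0.8]  # a state vector of unit magnitude

# Probability of each result: squared modulus of the projection upon its axis.
prob = {value: abs(inner(phi, psi)) ** 2 for value, phi in eigen.items()}
print(prob)
print(sum(prob.values()))
```

Because `psi` has unit magnitude and the two axes are orthonormal, the probabilities add up to one, as the note in the text requires.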

Note on cases with degeneracy. If $G_r$ is an eigenvalue of multiplicity p
which has p corresponding orthogonal eigenfunctions $\varphi_r^1, \varphi_r^2, \ldots, \varphi_r^p$,
where $\psi_r^i$ (i = 1, 2, ..., p) are called the projections of ψ upon these axes,
we must take

$$P_r = \sum_{i=1}^{p}|\psi_r^i|^2 \tag{97}$$

for the probability of the value $G_r$. (We note that this equation is invariant
with respect to any orthogonal transformation of the $\varphi_r^i$.) We may justify
this step by considering the degenerate operator $\mathfrak{G}$ as a limit of a
nondegenerate operator $\mathfrak{H}$, putting, for instance, $\mathfrak{H} = \mathfrak{G} + \epsilon\mathfrak{K}$ (where the
auxiliary operator, written $\mathfrak{K}$ here, is chosen so as not to make $\mathfrak{H}$
degenerate, and where ε is a quantity which
will be made to approach zero). Now to the eigenvalue $G_r$ of $\mathfrak{G}$ there
correspond p eigenvalues $H_r^i$ (i = 1, 2, ..., p) of $\mathfrak{H}$, which coincide with $G_r$
when ε goes to zero, and to these eigenvalues there correspond as many
eigenfunctions $\varphi_r^i$ and projections $\psi_r^i$ of ψ. The probability of the value $H_r^i$
for the observable H is $|\psi_r^i|^2$, and hence the total probability for the p values
$H_r^1, H_r^2, \ldots, H_r^p$, which merge into $G_r$ for ε → 0, is $\sum_i|\psi_r^i|^2$. If we let ε go
to zero, this expression has (97) for a limit.

An analogous case occurs if the operator $\mathfrak{G}$ is incomplete. For example,
let x and y be the variables of the problem, and suppose that the operator $\mathfrak{G}$
contains only x. We know then (see §10) that to the eigenvalue $G_r$ there
correspond the infinite number of eigenfunctions $\varphi_r^j = u^j(y)\chi_r(x)$ (where
the functions $u^j(y)$ represent any complete, orthonormal set of
eigenfunctions in the function space of y). Therefore, according to (97), for the
probability of the value $G_r$ we shall assume*

$$P_r = \sum_{j=1}^{\infty}|\psi_r^j|^2, \tag{97'}$$

where $\psi_r^j = \psi\cdot\varphi_r^j$. This expression is independent of the choice of the $u^j$.
Taking for these functions the functions δ(y − y'), where y' takes the
place of the index j (see §14), we have

$$\psi_r^{y'} = \psi\cdot\delta(y - y')\chi_r(x) = \iint\psi(x, y)\,\delta(y - y')\,\chi_r^{*}(x)\,dx\,dy = \int\psi(x, y')\,\chi_r^{*}(x)\,dx.$$

Hence (97') is transformed into the following integral (where we have
written y instead of y'):

$$P_r = \int|\psi_r^y|^2\,dy. \tag{98}$$

From this we get the following rule: To obtain the probability $P_r$, calculate
the eigenfunctions $\chi_r(x)$ of the operator $\mathfrak{G}$ in the function space of the
functions of x alone, and expand the function ψ(x, y) in a series of these
eigenfunctions, considering it to be a function of x only. Of course the
coefficients will be functions of y (and may therefore be denoted by $\psi_r^y$).
The square of the modulus of the rth coefficient, integrated with respect to y,
gives $P_r$.

Evidently, if $\mathfrak{G}$ had continuous eigenvalues G', in (98) we should have
to substitute the continuous variable G' for the discrete index r, and we
should have to interpret $P_{G'}$ as the probability density; thus the probability
for a value lying between G' and G' + dG' would be

$$dG'\int|\psi_{G'}^y|^2\,dy. \tag{99}$$

* This assumption could be justified directly by a limiting procedure
entirely analogous to that used in the case of multiple eigenvalues of order p.

It remains to be seen how we may determine the operator $\mathfrak{G}$
corresponding to a given observable G. For this purpose, we shall
proceed by way of successive generalizations.

Case (a): G = x. If the observable G is one of the coordinates
of the particle, x for example, the operator $\mathfrak{G}$ corresponding to it
is, as we shall now show, the multiplication by x; that is, $\mathfrak{G} = x$
(it is incomplete, except for the one-dimensional case). In fact,
as we have seen in §14, such an operator has for its eigenvalues all
the real numbers x', and for eigenfunctions δ(x − x'). The
projection of the state vector ψ upon this principal axis (calculated in
the function space of x) is

$$\psi_{x'} = \int\psi(x, y, z, t)\,\delta(x - x')\,dx = \psi(x', y, z, t),$$

and hence the probability of finding for x a value lying between x'
and x' + dx' is, by reason of (99),

$$dx'\iint|\psi(x', y, z, t)|^2\,dy\,dz,$$

in full accord with the meaning of $|\psi|^2$ as a probability density.

Case (b): G = $p_x$. If the observable G is a component of the
momentum, $p_x$ for example, we may verify that the corresponding
operator is

$$\mathfrak{G} = \frac{h}{2\pi i}\frac{\partial}{\partial x}.$$

In fact, relation (96) may then be written, p' being a general
eigenvalue, as

$$\frac{h}{2\pi i}\frac{\partial\varphi_{p'}}{\partial x} = p'\varphi_{p'}.$$

This equation has any value of p' for eigenvalue. By a simple
integration, it gives

$$\varphi_{p'} = \frac{1}{\sqrt{h}}\,e^{(2\pi i/h)p'x}, \tag{100}$$

where the factor $1/\sqrt{h}$ has been determined from the normalization
condition.
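
That the exponential (100) is indeed an eigenfunction of the operator $(h/2\pi i)\,\partial/\partial x$ can be checked numerically. The following sketch is an illustration under stated assumptions: the value of h is set to 1 in arbitrary units, the trial eigenvalue `p` and the sample points are arbitrary, and the helper names are introduced here. The derivative is approximated by a central difference.

```python
import cmath
import math

h = 1.0   # Planck's constant in arbitrary units (an assumption for this check)
p = 2.5   # a trial eigenvalue p'


def phi(x):
    """Momentum eigenfunction (100): exp((2*pi*i/h) * p * x) / sqrt(h)."""
    return cmath.exp(2j * math.pi * p * x / h) / math.sqrt(h)


def momentum_operator(f, x, dx=1e-6):
    """(h / (2*pi*i)) * df/dx, with the derivative taken by central difference."""
    derivative = (f(x + dx) - f(x - dx)) / (2 * dx)
    return h / (2j * math.pi) * derivative


for x in (0.0, 0.3, 1.7):
    ratio = momentum_operator(phi, x) / phi(x)
    print(x, ratio)   # the ratio should be close to p at every x
```

The ratio is the same at every sample point, which is what it means for the function to be an eigenfunction with eigenvalue p'.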





The projection of ψ upon the principal axis p' is

$$\psi_{p'}(y, z, t) = \frac{1}{\sqrt{h}}\int e^{-(2\pi i/h)p'x}\,\psi(x, y, z, t)\,dx, \tag{101}$$

and the probability that the x-component of the momentum lies
between p' and p' + dp' is, by virtue of (99),

$$dp'\iint|\psi_{p'}|^2\,dy\,dz. \tag{102}$$

Now let us calculate the same probability by the principle of
superposition. If we decompose ψ into Fourier integrals
(considering it as a function of x only, or considering y and z to be fixed)
we obtain

$$\psi(x, y, z, t) = \frac{1}{\sqrt{h}}\int\psi_{p'}(y, z, t)\,e^{(2\pi i/h)p'x}\,dp', \tag{103}$$

where $\psi_{p'}$ is simply expression (101). Hence $\psi_{p'}$ measures the
amplitude of the monochromatic component, of wavelength h/p',
of the de Broglie waves of a particle of given y and z (one-dimensional
case; see §30 of Part II). Hence, by the principle of superposition,
$|\psi_{p'}|^2\,dp'$ is the probability that the particle of given y and z has
an x-component of momentum lying between p' and p' + dp'.
Leaving y and z entirely undetermined, we evidently obtain just
the value given by (102) for the probability that $p_x$ lies between p'
and p' + dp'.

Case (c): an observable defined as a function of the coordinates or
of the momenta. Let A be an observable to which a certain operator
$\mathfrak{A}$ is known to correspond, with eigenvalues $A_r$ and eigenfunctions $\varphi_r$.
Let G be another observable defined as a function of A, that is,
G = F(A). The possible results of a measurement of G will be
$F(A_r)$, and each of these will have the same probability as the
value $A_r$ of A. Now it is easily seen that this result may also be
found by applying the procedure of page 331 and making the
operator $\mathfrak{G} = F(\mathfrak{A})$ correspond to G. In fact, as was seen in §10, this
operator has the same principal axes as $\mathfrak{A}$ (specified by the
vectors $\varphi_r$), with eigenvalues $G_r = F(A_r)$. Hence the projection of
the state vector upon the rth axis of $\mathfrak{G}$ is the same $\psi_r$ which
represents the projection of ψ upon the rth axis of $\mathfrak{A}$; and thus the
probability of the value $G_r$ is $|\psi_r|^2$, like the probability of the value $A_r$
of A. The theorem may immediately be extended to a function
F(A, B, ...) of several observables compatible with one another,
and it may be proved in the same way.

Now, having already seen that the operators corresponding to
the coordinates $q_r$ are the $q_r$ themselves and that the operators
corresponding to the momenta $p_r$ are $(h/2\pi i)(\partial/\partial q_r)$, we can
conclude that to an observable defined as F(q) there corresponds the
operator F(q), and that to an observable F(p) there corresponds
$F[(h/2\pi i)(\partial/\partial q)]$, that is, the operator obtained from the function
by substituting for each $p_r$ the expression $(h/2\pi i)(\partial/\partial q_r)$.

(d) General case. Now let G be any observable, defined directly
(that is, by operations which do not imply a measurement of the q
and the p). This is the case of the energy, for instance. No
general procedure exists for finding the corresponding operator $\mathfrak{G}$.
However, in almost all the cases which occur in practice, the
following heuristic method is successful; it is based upon the postulate
of a profound formal analogy between ordinary mechanics and
quantum mechanics. Of course this method will be justified a
posteriori by the success of its consequences, which have been
verified in all cases to which the method has been applied.

Ordinary mechanics permits us to express the value of the
quantity G at the time t as a function G = F(q, p) of the Cartesian
coordinates $q_i$ and of the momenta $p_i$ (and possibly of the time t,
which we now consider to be fixed). Let us suppose that this
function may be expanded in a power series, and let us write it
(if there are terms of the type $q_r p_r$) in symmetrized form, as
explained in §17. After we have done this, the natural
generalization of what we saw above for the case of F(q) or F(p) leads us to
accept the following postulate:* to the observable G which in
classical mechanics has the (symmetrized, if necessary) expression F(q, p),
there corresponds the operator $\mathfrak{G} = F[q, (h/2\pi i)(\partial/\partial q)]$. The cases
already examined under (a), (b), and (c) are evidently included in
this rule.

* In what is to follow, we shall need to apply this postulate only to
functions of the form $F(q, p) = Q(q) + P(p) + \sum_r A_r(q)\,p_r$, where only the last
part requires symmetrization.

Finally, there are some cases in which it is impossible to make
an expression of classical mechanics correspond to the observable in
question. In these cases, the operator $\mathfrak{G}$ representing the observable
is introduced by means of an ad hoc hypothesis, which is to be
justified by its consequences. Such is the case, for instance, of the
components of spin in the Pauli theory (see §45).

To sum up, the fundamental principle of quantum mechanics
may be stated as follows. Once the operator $\mathfrak{G}$ corresponding to
the observable G is determined, either by means of the rule cited
above or by means of a special hypothesis, we write the equation

$$\mathfrak{G}\varphi = G'\varphi, \tag{104}$$

where G' is a parameter. The eigenvalues $G_r$ of this equation
furnish the possible results of a measurement of the observable G;
and if we call $\varphi_r$ the corresponding eigenfunctions, the probability
of the value $G_r$ in the state ψ is given by $|\psi\cdot\varphi_r|^2$. For cases of
degeneracy or of continuous eigenvalues, see page 332.

It is immediately apparent that if we apply this general rule to
the case where the observable G is the energy of a particle, or of a
system of particles, subject to forces derivable from a potential
[in which case the function F(q, p) is the Hamiltonian $\mathcal{H}(q, p)$], we
again find the procedure of §19 for deriving the Schrödinger
equation for stationary states, or its generalization (88). Hence this
equation now appears as a particular case of the general problem
of the search for the possible values of an observable G, and of the
corresponding probabilities. The particular case in question occupies
a privileged position in quantum mechanics because of the
fundamental importance of the observable “energy” and also because of
its property of remaining constant.

23. Note on the use of curvilinear coordinates. The rule of
the preceding section for finding the operator corresponding to a
given observable $G$ presupposes that $G$ is expressed in terms of
the Cartesian coordinates $q_r$ and of the conjugate momenta $p_r$. In
many cases, however, it is more convenient to obtain the classical
expression for $G$ by means of some generalized coordinates, which
we shall denote by $Q_r$, calling $P_r$ the conjugate momenta. Let
$F(Q, P)$ be that expression. In order to obtain the operator $\mathfrak{G}$, it
is necessary to go to Cartesian coordinates, to perform the substitution
(S), and then once more to transform the operator obtained by
expressing it in terms of the $Q$. This process is not, in general,
equivalent to the simple substitution of $(h/2\pi i)(\partial/\partial Q_r)$ for $P_r$ in
$F(Q, P)$. (An example of this procedure has been given in §21.)

Take, for instance, the case of a point particle in a plane and
not subject to any forces. If we use the polar coordinates $Q_1 = r$,
$Q_2 = \theta$ and the respective momenta $P_1 = m\dot r$, $P_2 = mr^2\dot\theta$, the
expression for the Hamiltonian is

$$\mathcal{H} = \frac{1}{2m}\left(P_1^2 + \frac{P_2^2}{Q_1^2}\right). \tag{105}$$

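In Cartesian coordinates, on the other hand, $\mathcal{H} = (1/2m)(p_x^2 + p_y^2)$,
and the corresponding operator is the familiar

$$\mathfrak{H} = -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right),$$

which, when expressed in polar coordinates, is written as follows:

$$\mathfrak{H} = -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}\right).$$
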
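This expression is not obtained from (105) but from its algebraic
equivalent

$$\mathcal{H} = \frac{1}{2m}\left(\frac{1}{Q_1}P_1Q_1P_1 + \frac{P_2^2}{Q_1^2}\right)$$

by the substitution of $(h/2\pi i)(\partial/\partial Q_r)$ for $P_r$.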
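
The effect of the ordering of the factors can be checked numerically.
The sketch below is only an illustration: it drops the common factor
$-h^2/8\pi^2 m$, approximates the derivatives by finite differences, and
applies the operators to a test function of our own choosing,
$f = r^3\cos\theta$; the step size and all function names are assumptions.

```python
import math

STEP = 1e-3  # finite-difference step (an arbitrary choice for this sketch)

def f(r, theta):
    """Test function f = r^3 cos(theta), i.e. x (x^2 + y^2) in Cartesian form."""
    return r ** 3 * math.cos(theta)

def f_xy(x, y):
    return f(math.hypot(x, y), math.atan2(y, x))

def laplacian_cartesian(x, y, s=STEP):
    """Five-point approximation of d2/dx2 + d2/dy2."""
    return (f_xy(x + s, y) + f_xy(x - s, y) + f_xy(x, y + s) + f_xy(x, y - s)
            - 4.0 * f_xy(x, y)) / s ** 2

def angular_part(r, theta, s=STEP):
    """(1/r^2) d2f/dtheta2."""
    return (f(r, theta + s) - 2.0 * f(r, theta) + f(r, theta - s)) / (s * r) ** 2

def naive_operator(r, theta, s=STEP):
    """d2/dr2 + (1/r^2) d2/dtheta2: direct substitution into P1^2 + P2^2/Q1^2."""
    radial = (f(r + s, theta) - 2.0 * f(r, theta) + f(r - s, theta)) / s ** 2
    return radial + angular_part(r, theta, s)

def ordered_operator(r, theta, s=STEP):
    """(1/r) d/dr (r d/dr) + (1/r^2) d2/dtheta2: the ordering of the polar Laplacian."""
    flux_out = (r + 0.5 * s) * (f(r + s, theta) - f(r, theta)) / s
    flux_in = (r - 0.5 * s) * (f(r, theta) - f(r - s, theta)) / s
    return (flux_out - flux_in) / (s * r) + angular_part(r, theta, s)

r0, theta0 = 1.3, 0.7
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
print("Cartesian Laplacian :", laplacian_cartesian(x0, y0))
print("ordered polar form  :", ordered_operator(r0, theta0))
print("naive substitution  :", naive_operator(r0, theta0))
```

For this test function the Cartesian Laplacian is $8x$, which the
ordered polar form reproduces, whereas the direct substitution gives $5x$;
the difference is the missing term $(1/r)\,\partial f/\partial r$.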

24. Some consequences of the fundamental principle of quantum
mechanics. Let us suppose that the measurement of an observable
$G$ has furnished the result $G_r$. If we indicate by $\psi_-$ the state
vector immediately prior to the observation, it is obvious that
immediately after the observation, the state of the system will be
specified by a vector $\psi_+$ which is in general different from $\psi_-$; that
is, the perturbation produced in the system by the observation will
have caused a sudden change of the state vector. We recognize
immediately that $\psi_+$ must lie along a principal axis of the operator
$\mathfrak{G}$, an axis which corresponds to the eigenvalue $G_r$. In fact, the
immediate repetition of the observation $G$ would yield the result $G_r$
with certainty, and hence the projections of $\psi_+$ upon the principal
axes of $\mathfrak{G}$ corresponding to the eigenvalues different from $G_r$ must
be zero, or $\psi_+$ must be orthogonal to all the other axes. Now first
let us suppose that to the eigenvalue $G_r$ there corresponds a single
axis (that is, that the operator $\mathfrak{G}$ is complete and the eigenvalue $G_r$
simple). The vector $\psi_+$ will then be specified (it is sufficient to
know its direction, since we already know that it is a unit vector);
that is, the state of the system after the observation is completely
determined by the result of the observation, without our having to
know anything about the state prior to the observation. Such
observations, so to speak, completely renew the state of the system
and permit it to be defined by a number. A typical example is the
measurement of the energy of a nondegenerate system, in which
case the states specified in this manner are what we have called
"stationary states" or "definite-energy states." We may now
characterize them by the property that the state vector lies along
one of the principal axes of the Hamiltonian operator $\mathfrak{H}$. On the
other hand, the states which we have called "states with indefinite
energy" and which we have characterized in §29 of Part II, taking
for $\psi$ a linear combination of eigenfunctions $\psi_n$, are represented by
a vector which does not lie along any of the principal axes of $\mathfrak{H}$.
However, another complete Hermitian operator $\mathfrak{G}$ may exist (in fact,
an infinite number of them do exist), one of whose principal axes
coincides with the direction of $\psi$; or (assuming that for every
Hermitian operator there is a corresponding observable) a real
observable $G$ (other than the energy) exists such that the state
under consideration (with indefinite energy) may be thought of as
a "state with well-defined $G$" and may be characterized by the
value $G_r$ of $G$.


It is apparent that once the vector $\psi_+$ has been determined at the instant
immediately following the observation, the further evolution of $\psi$ in time is
governed by the time-dependent Schrödinger equation, until a new observation
perturbs the system.

The reader will now immediately understand the reason for this name by
noting that for one of these states the vector $\psi$ has the form
$\psi_n = u_n e^{-2\pi i E_n t/h}$, with $u_n$ independent of the time,
and hence maintains its direction and modulus unchanged in time (although
its "phase" varies).

In the one-dimensional case, for instance, if we let $\psi = \rho e^{i\theta}$ [with $\rho(x)$
and $\theta(x)$ real], it may readily be verified that this condition is satisfied by the
operator

$$\mathfrak{G} = \left(\mathfrak{p} - \frac{h}{2\pi}\theta'\right)^2 + \frac{h^2}{4\pi^2}\frac{\rho''}{\rho},$$

and that the $\psi$ corresponds to the eigenvalue 0. Of course, any function of
this $\mathfrak{G}$ also satisfies the desired condition [see E. Fermi, Nuovo Cimento VII, 10,
361 (1930)].

Let us now briefly consider the cases in which more than one
principal axis corresponds to the eigenvalue $G_r$, either because $G_r$ is
a multiple eigenvalue or because $\mathfrak{G}$ is an incomplete operator. In
these cases $\psi_+$ will not be determined by the result of the observation
but will also depend on some details of the observation which
are not reflected in its numerical result. In order to establish
these details precisely, we shall make the hypothesis that the
observation is to be performed in such a way as to perturb the
system as little as possible, by which we understand the following.
One expands $\psi_-$ in a series of the eigenfunctions $\varphi_r$ of $\mathfrak{G}$. Into this
expansion there will enter, for each eigenvalue $G_r$, a certain one of
the infinitely many eigenfunctions belonging to this eigenvalue.
We shall say that the observation is carried out with minimum
possible perturbation if $\psi_+$ is identical with this given eigenfunction.
Since we shall always suppose the observations carried out under
this condition, we may assert that in every case, $\psi_-$ may be expanded
in a series of all the possible $\psi_+$.

The geometrical interpretation of what has just been said is as follows.
If the eigenvalue $G_r$ is multiple of order $p$, there correspond to it an infinite
number of principal axes which form a linear manifold in $p$ dimensions
(orthogonal to all the other principal axes). The vector $\psi_+$ must in every
case lie within this manifold, which we shall call $V$. In what we have
termed the case of minimum perturbation, $\psi_+$ is identical with the
projection of $\psi_-$ upon the manifold $V$.

The case of an incomplete operator may be made to fall within the
preceding class by considering $p$ to be infinite.

For instance, if we measure $x$ of a particle without simultaneously
determining $y$ and $z$ (or $p_y$ and $p_z$), the result of the measurement is not enough to
determine the succeeding state of the system, since some momentum is
generally imparted along the $y$- and $z$-directions while $x$ is being measured (for
example, with a pinhole camera; see §23 of Part II). This is reflected in the
fact that the operator $\mathfrak{x}$ is incomplete. We may, however, arrange the
experiment so as to be certain not to alter $p_y$ and $p_z$, but only $p_x$, and the observation
will then be carried out "with minimum possible perturbation."

25. Criterion for the compatibility of observables. We shall
now prove a theorem of the greatest importance: The necessary and
sufficient condition for two observables to be compatible is that their
operators commute.

Let $A$ and $B$ be two observables (relative to the same instant)
which are compatible. Let us measure $A$, obtaining $A_r$, for instance;
and then $B$, obtaining $B_s$. Then the system is left in a state in
which a subsequent (immediate) observation of $A$ would yield the
result $A_r$ with certainty, and an observation of $B$ would certainly
give $B_s$. Hence the state vector $\psi_+$ after the double observation
will lie along a principal axis common to the two operators $\mathfrak{A}$ and $\mathfrak{B}$.
Since this statement must hold true no matter what the preceding
state, the vector $\psi_-$, no matter what its form, must be developable
in eigenfunctions common to $\mathfrak{A}$ and $\mathfrak{B}$, or these eigenfunctions must
form a complete system. Therefore, by the theorem of §11, $\mathfrak{A}$ and
$\mathfrak{B}$ commute.

Conversely, if $\mathfrak{A}$ and $\mathfrak{B}$ commute, they possess a complete
system of principal axes in common (see §11). First let us suppose
that neither $\mathfrak{A}$ nor $\mathfrak{B}$ is degenerate. Then any principal axis of $\mathfrak{A}$
is also a principal axis of $\mathfrak{B}$, to which we may assign the same index;
hence, after we have performed the observation of $A$ with the
result $A_r$, the state vector will be lying along the principal axis
corresponding to the value $A_r$ of $\mathfrak{A}$ and $B_r$ of $\mathfrak{B}$. Therefore by
observing $B$ we certainly obtain $B_r$, and the state vector is not
altered by this new observation. The same holds true if we first
observe $B$ and then $A$. Consequently, the two observables are
compatible. In fact, since to any value of the one there corresponds
a definite value of the other, they are functions of each other.

Now let us suppose that $\mathfrak{B}$ is a degenerate (or incomplete)
operator. Then the observation of $A$ aligns the state vector in
the direction of the $A_r$-axis (hence $\psi$ becomes $\varphi_r$), which is again
one of the infinite number of axes of $\mathfrak{B}$ belonging to a certain
eigenvalue $B_s$ which is multiple of order $p$ (possibly $\infty$). These
axes constitute a plane manifold $V$ orthogonal to all other axes of
$\mathfrak{B}$ belonging to other eigenvalues. Hence the subsequent observation
of $B$ must of necessity yield the result $B_s$ and at most will
rotate the state vector within $V$. However, since we suppose that
the observation is carried out with minimum possible perturbation
(in the sense explained above), the new state vector must be the
orthogonal projection of $\varphi_r$ upon $V$, which is evidently identical
with $\varphi_r$ itself. Hence $\psi$ will at the same time lie along a principal
axis of $\mathfrak{A}$ and $\mathfrak{B}$. If we now were to observe $B$ first (with result $B_s$)
and then $A$ (with result $A_r$), the first observation would bring the
state vector into the manifold $V$; and since (see §11) within $V$ there
are $p$ principal axes of $\mathfrak{A}$, whereas all the other principal axes of $\mathfrak{A}$
are perpendicular to $V$, the subsequent observation of $A$ will
necessarily align the state vector with one of the $p$ axes contained
in $V$ (and hence common to $\mathfrak{A}$ and $\mathfrak{B}$). It is to be noted that to a
value of $B$ there may correspond $p$ values of $A$.

Analogous reasoning is applied to the case in which both 
operators are degenerate or incomplete; here the relation between 
the results of the two measurements is even less stringent, or is 
lacking altogether. 

In particular it follows from this theorem that the condition for 
the compatibility of two observations is symmetric, as was pointed 
out in §16. 

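26. Commutation relations. By means of the theorem just
proved, it is possible to arrive again at the well-known fact that a
Cartesian coordinate $q_i$ ($i = 1, 2, 3$) and the corresponding component
of momentum $p_i$ are incompatible observables. Indeed, we
have seen that their respective operators are

$$\mathfrak{q}_i = q_i, \qquad \mathfrak{p}_i = \frac{h}{2\pi i}\frac{\partial}{\partial q_i},$$

and evidently do not commute, since we have for any function $f$

$$\mathfrak{p}_i\mathfrak{q}_i f - \mathfrak{q}_i\mathfrak{p}_i f = \frac{h}{2\pi i}\frac{\partial}{\partial q_i}(q_i f) - \frac{h}{2\pi i}\,q_i\frac{\partial f}{\partial q_i} = \frac{h}{2\pi i}f,$$

or

$$\mathfrak{p}_i\mathfrak{q}_i - \mathfrak{q}_i\mathfrak{p}_i = \frac{h}{2\pi i}. \tag{106}$$

On the other hand, a $q_i$ and a $p_j$ ($j \neq i$) obviously do commute;
that is,

$$\mathfrak{p}_j\mathfrak{q}_i - \mathfrak{q}_i\mathfrak{p}_j = 0. \tag{106'}$$

Also evidently, two $q$ or two $p$ commute:

$$\mathfrak{q}_i\mathfrak{q}_j - \mathfrak{q}_j\mathfrak{q}_i = 0, \tag{107}$$

$$\mathfrak{p}_j\mathfrak{p}_i - \mathfrak{p}_i\mathfrak{p}_j = 0. \tag{108}$$

Relations (106) and (106'), of fundamental importance in quantum
mechanics, are known as the commutation relations.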


In quantum mechanics an observable and its operator are often denoted
by the same symbol (or the corresponding matrix), rather than being
distinguished by different letters as has been done here. This practice
leads to writing the commutation relations (106) in the form

$$p_iq_i - q_ip_i = \frac{h}{2\pi i}.$$

The paradoxical aspect of these equations disappears if we recall that
they do not refer to the physical quantities $p_i$ and $q_i$ but merely to their
operators or corresponding matrices.

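Let us now establish some additional important commutation
relations. Let $Q(q_1, q_2, \ldots, q_f)$ be a function of the $q$ alone, and
let us consider it as an operator $\mathfrak{Q} = Q(\mathfrak{q}_1, \mathfrak{q}_2, \ldots, \mathfrak{q}_f)$. For any
(scalar) function $f$ we then have

$$\mathfrak{p}_i\mathfrak{Q}f - \mathfrak{Q}\mathfrak{p}_if = \frac{h}{2\pi i}\left[\frac{\partial}{\partial q_i}(Qf) - Q\frac{\partial f}{\partial q_i}\right] = \frac{h}{2\pi i}\frac{\partial Q}{\partial q_i}f,$$

and hence we may write the commutation relation

$$\mathfrak{p}_i\mathfrak{Q} - \mathfrak{Q}\mathfrak{p}_i = \frac{h}{2\pi i}\frac{\partial\mathfrak{Q}}{\partial\mathfrak{q}_i}. \tag{109}$$

For a (rational integral) function $P(p_1, p_2, \ldots, p_f)$ of the $p$
alone, an analogous relation holds: calling $\mathfrak{P}$ the operator $P(\mathfrak{p}_1, \mathfrak{p}_2,
\ldots, \mathfrak{p}_f)$, we obtain

$$\mathfrak{P}\mathfrak{q}_i - \mathfrak{q}_i\mathfrak{P} = \frac{h}{2\pi i}\frac{\partial\mathfrak{P}}{\partial\mathfrak{p}_i}, \tag{110}$$

where $\partial\mathfrak{P}/\partial\mathfrak{p}_i$ is to designate the operator which is obtained from the
function $\partial P/\partial p_i$ upon substitution of the operators $\mathfrak{p}_i$ for the $p_i$.
This formula may be derived by observing that $\mathfrak{P}$ will be made up
of terms of the type $a\,\mathfrak{p}_1^{n_1}\mathfrak{p}_2^{n_2}\cdots\mathfrak{p}_f^{n_f}$, and therefore that it will
suffice to prove (110) for an expression of this form. Now all these
factors except $\mathfrak{p}_i^{n_i}$ commute with $\mathfrak{q}_i$; and for $\mathfrak{p}_i^{n_i}$ the following
commutation rule holds [which may be found by successively applying
(106) $n_i$ times]:

$$\mathfrak{p}_i^{n_i}\mathfrak{q}_i - \mathfrak{q}_i\mathfrak{p}_i^{n_i} = \frac{h}{2\pi i}\,n_i\,\mathfrak{p}_i^{n_i-1} = \frac{h}{2\pi i}\frac{\partial\mathfrak{p}_i^{n_i}}{\partial\mathfrak{p}_i}.$$

Thus (110) is proved.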
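
Relations (106), (109), and (110) may be verified mechanically on
polynomials. The following sketch is an illustration under our own
assumptions: one coordinate only, units in which $h = 1$, functions
represented by lists of polynomial coefficients, and hypothetical helper
names.

```python
import math

C = 1 / (2j * math.pi)  # h/(2 pi i), in units where h = 1 (an assumption)

def p_op(f):
    """Momentum operator: (h/2 pi i) d/dq on the coefficient list f[k] of q**k."""
    return [C * k * f[k] for k in range(1, len(f))] or [0]

def q_op(f):
    """Coordinate operator: multiplication by q."""
    return [0] + list(f)

def multiply(g, f):
    """Product of two polynomials (multiplication by a function Q(q))."""
    out = [0] * (len(g) + len(f) - 1)
    for i, a in enumerate(g):
        for j, b in enumerate(f):
            out[i + j] += a * b
    return out

def subtract(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def is_zero(a, tol=1e-12):
    return all(abs(x) < tol for x in a)

def power(op, n, f):
    for _ in range(n):
        f = op(f)
    return f

f = [1, 2, 0, 5]                      # f(q) = 1 + 2 q + 5 q^3
Q, dQ = [0, 0, 1], [0, 2]             # Q(q) = q^2 and its derivative

# (106): (pq - qp) f = (h/2 pi i) f
lhs_106 = subtract(p_op(q_op(f)), q_op(p_op(f)))
# (109): (pQ - Qp) f = (h/2 pi i) (dQ/dq) f
lhs_109 = subtract(p_op(multiply(Q, f)), multiply(Q, p_op(f)))
# (110) for P = p^3: (p^3 q - q p^3) f = (h/2 pi i) 3 p^2 f
lhs_110 = subtract(power(p_op, 3, q_op(f)), q_op(power(p_op, 3, f)))

print(is_zero(subtract(lhs_106, [C * a for a in f])))
print(is_zero(subtract(lhs_109, [C * a for a in multiply(dQ, f)])))
print(is_zero(subtract(lhs_110, [C * 3 * a for a in power(p_op, 2, f)])))
```

Polynomials are chosen only because the derivative can then be taken
exactly; the relations themselves hold for any differentiable $f$.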

From (109) and (110) we obtain, for a function $G$ of the form
$G = P(p) + Q(q)$, the following commutation relations (which contain
all the previously cited cases as particular instances):

$$\mathfrak{p}_i\mathfrak{G} - \mathfrak{G}\mathfrak{p}_i = \frac{h}{2\pi i}\frac{\partial\mathfrak{G}}{\partial\mathfrak{q}_i}, \tag{111}$$

$$\mathfrak{G}\mathfrak{q}_i - \mathfrak{q}_i\mathfrak{G} = \frac{h}{2\pi i}\frac{\partial\mathfrak{G}}{\partial\mathfrak{p}_i}. \tag{112}$$

27. Average values. Quantum mechanics is designed to deal
with so-called elementary phenomena, namely, those in which only
one or a small number of elementary particles are involved. However,
it is important to show that when quantum mechanics is
applied to macroscopic systems (that is, systems composed of a
very large number of elementary particles, such as ordinary bodies)
its results coincide with the results of classical mechanics. Therefore
classical mechanics is not in conflict with quantum mechanics
but simply represents a limiting case of the latter.

We shall call "macroscopic observation" an observation which
is equivalent to the measurement of the same observable $G$ on a
large number of equal systems (such as atoms) and to taking the
average value of the results obtained. The greater part of ordinary
physical measurements are just of this type. For example, the
determination of an electric current is equivalent to taking the
average of the velocities of the single electrons constituting the
current (and to multiplying it by an appropriate factor); the
determination of the coordinates of the center of mass of a body is
equivalent to measuring the coordinates of the separate molecules and to
taking their average; and so forth. From this introduction it may
be seen why the study of the average value of an observable in
an ensemble of systems (of identical constitution) is of interest.

If all the systems of the ensemble under consideration are in the
same "state," the ensemble is said to represent a pure case; otherwise
it is called a mixture and may be decomposed into a number
of partial ensembles, each of which represents a pure case. In
what is to follow, we shall always refer to a pure case unless a
mixture is specified.

If in each system of the ensemble the measurement of the
observable $G$ can give the results $G_r$, with respective probabilities $P_r$,
the average value $\bar G$ of all the results obtained when measuring $G$
in all the systems of the ensemble will be

$$\bar G = \sum_r G_r P_r. \tag{113}$$

Now if the state of each system is represented by the vector
$\psi$ (which is the same for all systems) and if we call $\varphi_r$ the
eigenfunction of the operator $\mathfrak{G}$ corresponding to the eigenvalue $G_r$,
the projection of $\psi$ upon the vector $\varphi_r$ is

$$\psi_r = \psi\cdot\varphi_r, \tag{114}$$

and $P_r = |\psi_r|^2$ by the fundamental principle of quantum mechanics,
and hence

$$\bar G = \sum_r G_r|\psi_r|^2.$$

We may transform this formula further by using the fact that
$|\psi_r|^2 = \psi_r(\varphi_r\cdot\psi)$ according to (114). Furthermore, since
$\mathfrak{G}\varphi_r = G_r\varphi_r$ and since $\psi_r$ is a constant, we have

$$\sum_r G_r\psi_r(\varphi_r\cdot\psi) = \sum_r\psi_r(\mathfrak{G}\varphi_r)\cdot\psi = \Bigl(\mathfrak{G}\sum_r\psi_r\varphi_r\Bigr)\cdot\psi,$$

or

$$\bar G = (\mathfrak{G}\psi)\cdot\psi, \tag{115}$$

or else in explicit form

$$\bar G = \int\psi^*\,\mathfrak{G}\psi\,dS, \tag{115'}$$

where, as usual, $dS = dq_1\,dq_2\cdots dq_f$, and the integral is understood
to be extended over the whole space of the $q$. As may be
seen, to a definite state $\psi$ of the systems there corresponds a definite
average value for any observable; hence for macroscopic observations
there is no uncertainty principle.
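
The equality of the two expressions for the average value may be
checked on a small example. The sketch below is an illustration under
our own assumptions: a space of two dimensions, hypothetical eigenvalues
and principal axes, and the convention that the second factor of the
scalar product is the one to be conjugated.

```python
import math

def dot(a, b):
    """Scalar product a . b = sum_k a_k conj(b_k), the convention assumed here."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

def apply(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

s = 1 / math.sqrt(2)
eigenvalues = [2.0, -1.0]                              # hypothetical G_r
eigenvectors = [(s + 0j, s * 1j), (s + 0j, -s * 1j)]   # orthonormal phi_r

# Operator with these principal axes: G_jk = sum_r G_r phi_r[j] conj(phi_r[k]).
G = [[sum(g * phi[j] * phi[k].conjugate()
          for g, phi in zip(eigenvalues, eigenvectors))
      for k in range(2)] for j in range(2)]

psi = (0.6 + 0j, 0.8j)                                 # normalized state

# (113): weighted sum over the possible results.
probabilities = [abs(dot(psi, phi)) ** 2 for phi in eigenvectors]
average_113 = sum(g * p for g, p in zip(eigenvalues, probabilities))

# (115): scalar product of G psi with psi.
average_115 = dot(apply(G, psi), psi).real

print(probabilities, average_113, average_115)
```

Both routes give the same number, although the second never refers to
the eigenvalues or eigenfunctions explicitly.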

If the ensemble is a mixture, it will be decomposed into partial
ensembles, in each of which the state of the systems will be
represented by a vector $\psi^{(s)}$. The average value $\bar G^{(s)}$ for each of these
is to be calculated by means of (115). The required average value
will be

$$\bar G = \sum_s c^{(s)}\bar G^{(s)}, \tag{116}$$

where $c^{(s)} = N^{(s)}/N$ is the ratio of the number $N^{(s)}$ of systems in the
state $\psi^{(s)}$ to the total number $N$.

If $G$ is a Cartesian coordinate $q_i$, relation (115') yields

$$\bar q_i = \int q_i|\psi|^2\,dS; \tag{116'}$$

and if $G$ is the conjugate momentum $p_i$, it yields

$$\bar p_i = \frac{h}{2\pi i}\int\psi^*\frac{\partial\psi}{\partial q_i}\,dS. \tag{117}$$

These formulas are identical with those of Chapter 5, which
define the center of a wave packet and its average propagation
vector.

In order to include cases of degeneracy also, it is necessary to make $p$
separate terms correspond to each multiple eigenvalue of order $p$ in (113).

28. Time derivatives. Classical mechanics as a limiting case.
Thus far we have considered an observable to be defined only for a
certain instant $t = t_0$. However, it is obviously appropriate to
introduce something which is analogous to the classical concept of
"physical quantity as a function of time." Let us therefore consider
that the same physical process which, when put into effect at the
time $t_0$, defines the observable $G_{t_0}$ will, when put into effect at
another instant $t$, define another observable $G_t$. In this way we are
led to consider a continuous succession of observables which is
entirely analogous to the continuous succession of the values of a
physical quantity as a function of time in classical mechanics. We
shall sometimes say, by analogy, that we are dealing with an
"observable which is a function of $t$." A general observable $G_t$ will
be represented by the same operator $\mathfrak{G}(\mathfrak{q}, \mathfrak{p})$ which represents $G_{t_0}$,
since the physical process defining it is the same. However, by
supposing that some of the parameters which enter into the definition
of the process of observation are functions of $t$, we shall also
be able to generalize the concept of variation in time. Then the
operator will contain $t$ as well as the $q$ and $p$; that is, it will depend
explicitly on the time (it will then be said that the observable $G$
depends explicitly on $t$). In that case, the eigenvalues and the
eigenfunctions of $G$ will in general be functions of $t$. In the case
that $G$ does not contain $t$ explicitly, the eigenvalues are constants,
and the eigenfunctions are independent of $t$; that is, the principal
axes are fixed in Hilbert space.

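We shall now see how to introduce for the observables the
analogue of the time derivative of a classical physical quantity.
It is clear that one may not define, for the derivative $\dot G$ of the
observable $G$, the expression

$$\lim_{dt\to 0}\frac{G_{t+dt} - G_t}{dt},$$

since this way of writing has no meaning; $G_{t+dt}$ and $G_t$ do not
represent two numbers but two procedures of observation referring
to different instants, each of which may yield an infinite number
of numerical results. Instead, it is necessary to have recourse to a
consideration of the average values and to define $\dot G$ as an observable
such that its average value (defined as in the preceding section) is
equal to the derivative of the average value of the observable $G$.
Let us therefore differentiate (115), supposing, for greater generality,
that the operator $\mathfrak{G}$ depends explicitly on $t$. We have

$$\frac{d\bar G}{dt} = \frac{d}{dt}\bigl[(\mathfrak{G}\psi)\cdot\psi\bigr],$$

and also, indicating by $\partial\mathfrak{G}/\partial t$ the operator obtained from the
expression for $\mathfrak{G}$ by the formal operation of differentiation with
respect to $t$ (hence $\partial\mathfrak{G}/\partial t = 0$ if $\mathfrak{G}$ does not contain $t$ explicitly),

$$\frac{d\bar G}{dt} = \Bigl(\frac{\partial\mathfrak{G}}{\partial t}\psi\Bigr)\cdot\psi + \Bigl(\mathfrak{G}\frac{\partial\psi}{\partial t}\Bigr)\cdot\psi + (\mathfrak{G}\psi)\cdot\frac{\partial\psi}{\partial t}.$$

Substituting expression (87) for the derivative of $\psi$ we have,
recalling (S'),

$$\frac{d\bar G}{dt} = \Bigl(\frac{\partial\mathfrak{G}}{\partial t}\psi\Bigr)\cdot\psi - \frac{2\pi i}{h}(\mathfrak{G}\mathfrak{H}\psi)\cdot\psi + \frac{2\pi i}{h}(\mathfrak{G}\psi)\cdot(\mathfrak{H}\psi).$$

But since $\mathfrak{H}$ is Hermitian, we have (see §9)

$$(\mathfrak{G}\psi)\cdot(\mathfrak{H}\psi) = (\mathfrak{H}\mathfrak{G}\psi)\cdot\psi,$$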


This amounts to saying that from an ensemble of a very large number of
samples of the system as defined in the preceding section (possibly a mixture),
we select at random a certain (rather large) number of systems, observe $G_t$ on
them, and find the average value. Then we calculate the numerical quantity

$$\lim_{dt\to 0}\frac{\bar G_{t+dt} - \bar G_t}{dt}.$$

Finally we postulate that there exists an observable $\dot G$ such that its average
value (taken from a third group of samples removed from the same ensemble)
is equal to that quantity, no matter what the composition of the ensemble.
Thus instead of measuring $G_t$ and $G_{t+dt}$ on the same system (a method which
gives rise to the difficulty that the second measurement is carried out on the
system in a different state from the one before), we perform the two
measurements on different systems.

and hence

$$\frac{d\bar G}{dt} = \Bigl(\Bigl[\frac{\partial\mathfrak{G}}{\partial t} + \frac{2\pi i}{h}(\mathfrak{H}\mathfrak{G} - \mathfrak{G}\mathfrak{H})\Bigr]\psi\Bigr)\cdot\psi.$$

Comparing this with (115), we note that $d\bar G/dt$ is calculated formally
as the average value of an observable whose operator is the expression
contained in the square brackets. We shall therefore represent
the derivative of the observable $G$ symbolically by the formula

$$\dot G = \frac{\partial G}{\partial t} + \frac{2\pi i}{h}(\mathcal{H}G - G\mathcal{H}), \tag{118}$$

to which corresponds the analogous relation between the operators,
where the operator corresponding to the observable $\dot G$ is denoted
by $\dot{\mathfrak{G}}$,

$$\dot{\mathfrak{G}} = \frac{\partial\mathfrak{G}}{\partial t} + \frac{2\pi i}{h}(\mathfrak{H}\mathfrak{G} - \mathfrak{G}\mathfrak{H}). \tag{118'}$$

If $G$ does not depend explicitly on $t$, the first term on the right-hand
side will be absent.
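
The content of (118') can be tested numerically on the simplest system
with two stationary states. The following sketch is an illustration
under our own assumptions: units in which $h/2\pi = 1$, a hypothetical
diagonal Hamiltonian with two energy levels, an observable without
explicit time dependence, and invented function names.

```python
import cmath

HBAR = 1.0                      # h / 2 pi, set to 1 (an assumption of this sketch)
E = (0.5, 2.0)                  # hypothetical energy levels; H = diag(E)
C0 = (0.6, 0.8)                 # amplitudes of psi at t = 0 (normalized)

H = [[E[0], 0.0], [0.0, E[1]]]
G = [[0.0, 1.0], [1.0, 0.0]]    # a time-independent observable

def psi(t):
    """Solution of the Schrodinger equation for the diagonal Hamiltonian H."""
    return [c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(C0, E)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def average(op, state):
    """(op state) . state with the second factor conjugated."""
    image = [sum(op[i][k] * state[k] for k in range(2)) for i in range(2)]
    return sum(x * y.conjugate() for x, y in zip(image, state)).real

# Operator of the derivative according to (118'), without explicit time dependence.
HG, GH = matmul(H, G), matmul(G, H)
G_DOT = [[1j / HBAR * (HG[i][j] - GH[i][j]) for j in range(2)] for i in range(2)]

def derivative_of_average(t, dt=1e-5):
    """Central-difference derivative of the average value of G."""
    return (average(G, psi(t + dt)) - average(G, psi(t - dt))) / (2 * dt)

t = 0.37
print(derivative_of_average(t), average(G_DOT, psi(t)))
```

The two printed numbers agree to within the accuracy of the finite
difference: the average of the operator (118') is the rate of change of
the average of $G$.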

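Here we may give a meaningful interpretation to the derivative $\dot G$
defined by (118), or, better, to the differential $\dot G\,dt$. Consider the
observable $G + \dot G\,dt$, which we shall call $g$, and suppose that its measurement
(at time $t$) yields the value $g'$. We shall show that after such an
observation the system is left in a state such that if we measure the observable $G$
at time $t + dt$, we find the value $g'$. Thus we may say that the
measurement (at time $t$) of the observable $G + \dot G\,dt$ serves to determine the value
which $G$ will have at the time $t + dt$. It is to be noted, however, that such
a measurement is incompatible with a measurement of $G$ at time $t$. In
order to prove this assertion, let us call $\psi$ the function representing the
state in which the system is left after a measurement of $g$ has yielded the
result $g'$. We shall have

$$\mathfrak{g}\psi(t) = g'\psi(t), \tag{119}$$

or, simply writing $\psi$ for $\psi(t)$,

$$\Bigl[\mathfrak{G} + \frac{\partial\mathfrak{G}}{\partial t}\,dt + \frac{2\pi i}{h}(\mathfrak{H}\mathfrak{G} - \mathfrak{G}\mathfrak{H})\,dt\Bigr]\psi = g'\psi. \tag{119'}$$

We are to verify that if we let this $\psi$ evolve for a time $dt$, we shall
obtain a $\psi(t + dt)$ which is an eigenfunction of the operator $\mathfrak{G}_{t+dt}$
corresponding to the eigenvalue $g'$, namely, that

$$\mathfrak{G}_{t+dt}\,\psi(t + dt) = g'\psi(t + dt),$$

or

$$\Bigl(\mathfrak{G} + \frac{\partial\mathfrak{G}}{\partial t}\,dt\Bigr)\Bigl(\psi + \frac{\partial\psi}{\partial t}\,dt\Bigr) = g'\Bigl(\psi + \frac{\partial\psi}{\partial t}\,dt\Bigr).$$

Expanding, and substituting for the derivative of $\psi$ the value given
by the Schrödinger equation (87), we have

$$\mathfrak{G}\psi + \frac{\partial\mathfrak{G}}{\partial t}\psi\,dt - \frac{2\pi i}{h}\mathfrak{G}\mathfrak{H}\psi\,dt = g'\psi - \frac{2\pi i}{h}\,g'\mathfrak{H}\psi\,dt,$$

and, because of (119),

$$\mathfrak{G}\psi + \frac{\partial\mathfrak{G}}{\partial t}\psi\,dt - \frac{2\pi i}{h}\mathfrak{G}\mathfrak{H}\psi\,dt = g'\psi - \frac{2\pi i}{h}\,\mathfrak{H}\mathfrak{g}\psi\,dt.$$

In the last term $\mathfrak{g}$ may be replaced by $\mathfrak{G}$ (up to terms in $dt^2$). The equation
will then coincide with (119'), and thus is verified.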

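Let us apply this result to obtain again, in a generalized and
more precise form, the result that a wave packet moves like a
point particle in ordinary mechanics. In Part II this principle
(limited to wave packets sufficiently small to be considered
point-sized) constituted our point of departure for the establishment of
the Schrödinger equation.

Identifying $G$ with a coordinate $q_i$, we get from (118')

$$\dot{\mathfrak{q}}_i = \frac{2\pi i}{h}(\mathfrak{H}\mathfrak{q}_i - \mathfrak{q}_i\mathfrak{H});$$

and identifying $G$ with the momentum $p_i$, we obtain

$$\dot{\mathfrak{p}}_i = \frac{2\pi i}{h}(\mathfrak{H}\mathfrak{p}_i - \mathfrak{p}_i\mathfrak{H}).$$

Since the Hamiltonian $\mathcal{H}$ is of the form $P(p) + Q(q)$, we may apply
(111) and (112), finding

$$\dot{\mathfrak{q}}_i = \frac{\partial\mathfrak{H}}{\partial\mathfrak{p}_i}, \qquad \dot{\mathfrak{p}}_i = -\frac{\partial\mathfrak{H}}{\partial\mathfrak{q}_i}.$$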

According to what was said in connection with expression (118),
these relations between operators stand for the following relations
between the average values of the corresponding observables:

$$\frac{d\bar q_i}{dt} = \overline{\frac{\partial\mathcal{H}}{\partial p_i}}, \qquad \frac{d\bar p_i}{dt} = -\overline{\frac{\partial\mathcal{H}}{\partial q_i}}. \tag{120}$$

Hence, Hamilton's equations hold on the average. For instance, for
a point particle in Cartesian coordinates, we have

$$\mathcal{H} = \frac{1}{2m}\sum_{i=1}^{3}p_i^2 + U(q),$$

and hence the preceding equations yield

$$\frac{d\bar q_i}{dt} = \frac{1}{m}\bar p_i, \qquad \frac{d\bar p_i}{dt} = -\overline{\frac{\partial U}{\partial q_i}} = \bar F_i,$$

where $F_i$ denotes the components of the force. Eliminating $\bar p_i$ from
these equations, we obtain

$$m\frac{d^2\bar q_i}{dt^2} = \bar F_i; \tag{121}$$

that is: the center of the wave packet moves like a material point
obeying classical mechanics, subject to a force which is calculated
by taking the average value of the force on the entire wave packet
(Ehrenfest's theorem). It is apparent that this theorem applies to
a wave packet of any size, whereas in the considerations which led
to the Schrödinger equation in Part II, we referred to the limiting
case of a practically point-sized packet. Furthermore, by applying
(116), it may easily be seen that the theorem holds also for the
average values of a mixture.

29. First integrals. In classical mechanics the first integral of
a problem is an expression $G(q, p)$ of such a nature that it reduces to a
constant if the $q$ and the $p$ vary in time so as to satisfy the equations
of dynamics.

Analogously, in quantum mechanics we define as first integral
an observable $G$ such that its derivative $\dot G$ defined by (118) is
identically zero, that is, such that

$$\frac{\partial\mathfrak{G}}{\partial t} + \frac{2\pi i}{h}(\mathfrak{H}\mathfrak{G} - \mathfrak{G}\mathfrak{H}) = 0. \tag{122}$$

It may readily be seen that the energy $\mathcal{H}$ is a first integral if (and
only if) $\partial\mathcal{H}/\partial t = 0$, that is, if the Hamiltonian does not contain the
time explicitly. In that case, the system is said to be conservative.



It may be shown that if $G$ is a first integral, its eigenvalues will
be constants (even if $G$ contains the time $t$ explicitly), and furthermore,
although the principal axes of $\mathfrak{G}$ are not fixed, their rotation
in Hilbert space bears a relation to the rotation of the state vector $\psi$
(no matter what $\psi$ may be) such that the projections of $\psi$ upon these
axes maintain a constant value. This means that the probabilities
of the single eigenvalues do not vary in time.

Generally we shall confine ourselves to a consideration of first
integrals which do not contain $t$ explicitly. Then (122) becomes

$$\mathfrak{H}\mathfrak{G} - \mathfrak{G}\mathfrak{H} = 0,$$

which expresses the fact that the necessary and sufficient condition
for an observable $G$ (not containing the time $t$) to be a first integral is
that its operator commute with the Hamiltonian, or that the observation
of $G$ be compatible with a simultaneous observation of the energy.

30. Angular momenta and their operators. As an application
of the preceding sections, let us consider the three observables $M_x$,
$M_y$, $M_z$, the angular momenta of a particle with respect to the
$x$-, $y$-, and $z$-axes, and let us first of all find the operators
$\mathfrak{M}_x$, $\mathfrak{M}_y$, $\mathfrak{M}_z$ corresponding to them.

For instance, considering $M_z$, we observe that its expression in
classical mechanics is

$$M_z = xp_y - yp_x. \tag{123}$$

Hence, according to the rule of §22, the operator corresponding to
this expression is

$$\mathfrak{M}_z = \frac{h}{2\pi i}\Bigl(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\Bigr), \tag{124}$$

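and similarly for $M_x$ and $M_y$. First we note that any two of these
observables are incompatible. We have, for instance, that

$$\mathfrak{M}_x\mathfrak{M}_y = (y\mathfrak{p}_z - z\mathfrak{p}_y)(z\mathfrak{p}_x - x\mathfrak{p}_z) = y\mathfrak{p}_x\mathfrak{p}_zz - xy\mathfrak{p}_z^2 - z^2\mathfrak{p}_y\mathfrak{p}_x + x\mathfrak{p}_yz\mathfrak{p}_z,$$

$$\mathfrak{M}_y\mathfrak{M}_x = (z\mathfrak{p}_x - x\mathfrak{p}_z)(y\mathfrak{p}_z - z\mathfrak{p}_y) = x\mathfrak{p}_y\mathfrak{p}_zz - xy\mathfrak{p}_z^2 - z^2\mathfrak{p}_x\mathfrak{p}_y + y\mathfrak{p}_xz\mathfrak{p}_z,$$

and hence, because of (106) and (124),

$$\mathfrak{M}_x\mathfrak{M}_y - \mathfrak{M}_y\mathfrak{M}_x = (y\mathfrak{p}_x - x\mathfrak{p}_y)(\mathfrak{p}_zz - z\mathfrak{p}_z) = -\frac{h}{2\pi i}\mathfrak{M}_z.$$

Therefore the following commutation relations hold for angular
momenta:

$$\begin{aligned}
\mathfrak{M}_x\mathfrak{M}_y - \mathfrak{M}_y\mathfrak{M}_x &= -\frac{h}{2\pi i}\mathfrak{M}_z,\\
\mathfrak{M}_y\mathfrak{M}_z - \mathfrak{M}_z\mathfrak{M}_y &= -\frac{h}{2\pi i}\mathfrak{M}_x,\\
\mathfrak{M}_z\mathfrak{M}_x - \mathfrak{M}_x\mathfrak{M}_z &= -\frac{h}{2\pi i}\mathfrak{M}_y.
\end{aligned} \tag{125}$$

When vector notation is introduced for the operators, the above
relations may be expressed by the symbolic formula

$$\boldsymbol{\mathfrak{M}}\times\boldsymbol{\mathfrak{M}} = -\frac{h}{2\pi i}\boldsymbol{\mathfrak{M}}. \tag{125'}$$

For a proof of the statement of §29 on the constancy of the probabilities
of the eigenvalues of a first integral, see, for example, No. 4 of the
Bibliography, page 229.

This incompatibility is the cause of the paradoxical nature of the angular
momentum quantization in the Bohr-Sommerfeld theory (cf. §56 and §62).
In fact, $M_x$, $M_y$, $M_z$ cannot be considered as the components of an ordinary
vector since, by reason of the conceptual impossibility of measuring two of
them simultaneously, the direction of this vector turns out to be conceptually
unobservable.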


Let us now look for the eigenfunctions and eigenvalues of these
operators. Let us take $\mathfrak{M}_z$, for example. We note that upon
introducing polar coordinates $r$, $\theta$, $\varphi$, with the $z$-axis as polar axis, we
obtain

$$\frac{\partial}{\partial\varphi} = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x},$$

and hence (124) becomes

$$\mathfrak{M}_z = \frac{h}{2\pi i}\frac{\partial}{\partial\varphi}. \tag{126}$$

Therefore the equation for the eigenfunctions $\psi(r, \theta, \varphi)$ becomes

$$\frac{h}{2\pi i}\frac{\partial\psi}{\partial\varphi} = M_z'\psi,$$

where $M_z'$ stands for a general eigenvalue. Hence

$$\psi = f(r, \theta)\,e^{(2\pi i/h)M_z'\varphi},$$

where $f$ is an arbitrary function of $r$ and $\theta$ (its presence is due to the
fact that $\mathfrak{M}_z$ is an incomplete operator). This $\psi$ is evidently
periodic in $\varphi$ with period $h/M_z'$. However, for $\psi$ to have only one
value at each point of space, it must be periodic in $\varphi$ with period $2\pi$.
Hence we must have $mh/M_z' = 2\pi$ with $m$ an integer, or

$$M_z' = m\frac{h}{2\pi}.$$

These are the desired eigenvalues; the same would be found for
$M_x$ and $M_y$.
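
The requirement of single-valuedness can be illustrated directly. The
sketch below is only an illustration; the unit of $h$, the test angle,
and the function names are our own assumptions.

```python
import cmath
import math

H_PLANCK = 1.0  # Planck's constant in arbitrary units (an assumption of this sketch)

def angular_factor(m_z, phi):
    """exp((2 pi i / h) M_z' phi), the phi-dependence of the eigenfunction."""
    return cmath.exp(2j * math.pi * m_z * phi / H_PLANCK)

def is_single_valued(m_z, tol=1e-9):
    """True if the factor returns to its value when phi increases by 2 pi."""
    phi = 0.3  # any test angle will do
    return abs(angular_factor(m_z, phi + 2 * math.pi) - angular_factor(m_z, phi)) < tol

for multiple in (-1, 0, 1, 2, 0.5, 1.3):
    m_z = multiple * H_PLANCK / (2 * math.pi)
    print(multiple, is_single_valued(m_z))
```

Only the integral multiples of $h/2\pi$ pass the test; for a half-integral
multiple the factor changes sign after one turn about the axis.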

Therefore the result of a measurement of the angular momentum
with respect to an axis is always a (positive, zero, or negative)
multiple of $h/2\pi$. This result extends and clarifies the "spatial
quantization" of the Sommerfeld theory (see §56 of Part II).

We may also readily see that $M_z$ is (as in classical mechanics)
a first integral if the forces have zero moment with respect to the
$z$-axis. As a matter of fact, in that case the potential $U$, expressed
in polar coordinates, must be independent of $\varphi$; hence the
Hamiltonian $\mathfrak{H}$ does not contain $\varphi$, and therefore commutes with the
operator (126).

Let us now consider the observable $M$, the modulus of the
angular momentum of a particle with respect to the origin. Classically
we have $M^2 = M_x^2 + M_y^2 + M_z^2$. Hence for the operator $\mathfrak{M}^2$
corresponding to $M^2$ we shall take

$$\mathfrak{M}^2 = \mathfrak{M}_x^2 + \mathfrak{M}_y^2 + \mathfrak{M}_z^2, \tag{127}$$

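where $\mathfrak{M}_x$, and so on, are given by (124) and analogous expressions.
Inserting these and taking (106) into account, we find, by a simple
calculation,

$$\mathfrak{M}^2 = r^2\,\mathrm{S}\,\mathfrak{p}_x^2 - \bigl(\mathrm{S}\,x\mathfrak{p}_x\bigr)^2 - \frac{h}{2\pi i}\,\mathrm{S}\,x\mathfrak{p}_x,$$

where S indicates a sum carried out by successively changing $x$ into
$y$ and $z$. Now, recalling the meaning of the operators $\mathfrak{p}_x$, $\mathfrak{p}_y$, $\mathfrak{p}_z$, we
see that

$$\mathrm{S}\,\mathfrak{p}_x^2 = -\frac{h^2}{4\pi^2}\Delta, \qquad \mathrm{S}\,x\mathfrak{p}_x = \frac{h}{2\pi i}\,r\frac{\partial}{\partial r},$$

and hence

$$\mathfrak{M}^2 = -\frac{h^2}{4\pi^2}\Bigl(r^2\Delta - r^2\frac{\partial^2}{\partial r^2} - 2r\frac{\partial}{\partial r}\Bigr). \tag{128}$$

We now recall that the Laplacian operator $\Delta$ in polar coordinates
is given by

$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Lambda,$$

where we have denoted by $\Lambda$ the operator, independent of $r$,

$$\Lambda = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\Bigl(\sin\theta\frac{\partial}{\partial\theta}\Bigr) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}.$$

Substituting into (128), we find

$$\mathfrak{M}^2 = -\frac{h^2}{4\pi^2}\Lambda. \tag{131}$$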

The values of the observable $M^2$ are therefore given by $-(h^2/4\pi^2)\Lambda'$,
where $\Lambda'$ is an eigenvalue of the equation

$$\Lambda\psi = \Lambda'\psi. \tag{132}$$

This equation is none other than (223') of §46, Part II, the differential
equation for spherical harmonics ($\Lambda'$ corresponding to $-C$).
As we have seen, its eigenvalues are $\Lambda' = -l(l + 1)$, with $l = 0$,
1, 2, ... . Therefore the values which the observable $M$, or
angular momentum, may take are given by

$$M = \frac{h}{2\pi}\sqrt{l(l + 1)}. \tag{133}$$

This result has been stated in §46 of Part II.

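It is to be noted that the operator $\mathfrak{M}^2$, and hence also $\mathfrak{M}$,
commutes with each of the operators $\mathfrak{M}_x$, $\mathfrak{M}_y$, $\mathfrak{M}_z$. In fact, we have,
because of (125),

$$\begin{aligned}
\mathfrak{M}^2\mathfrak{M}_x - \mathfrak{M}_x\mathfrak{M}^2 &= \mathfrak{M}_y^2\mathfrak{M}_x - \mathfrak{M}_x\mathfrak{M}_y^2 + \mathfrak{M}_z^2\mathfrak{M}_x - \mathfrak{M}_x\mathfrak{M}_z^2\\
&= \mathfrak{M}_y\Bigl(\mathfrak{M}_x\mathfrak{M}_y + \frac{h}{2\pi i}\mathfrak{M}_z\Bigr) - \Bigl(\mathfrak{M}_y\mathfrak{M}_x - \frac{h}{2\pi i}\mathfrak{M}_z\Bigr)\mathfrak{M}_y\\
&\quad + \mathfrak{M}_z\Bigl(\mathfrak{M}_x\mathfrak{M}_z - \frac{h}{2\pi i}\mathfrak{M}_y\Bigr) - \Bigl(\mathfrak{M}_z\mathfrak{M}_x + \frac{h}{2\pi i}\mathfrak{M}_y\Bigr)\mathfrak{M}_z = 0;
\end{aligned} \tag{134}$$

similarly we show that $\mathfrak{M}^2$ commutes with $\mathfrak{M}_y$ and $\mathfrak{M}_z$. Thus the
measurement of the total angular momentum is compatible with
the measurement of its projection upon any direction, whereas the
measurements of two components of the angular momentum are
incompatible.

We recognize, then, that $M^2$ is a first integral (as in classical
mechanics), if the force is of the central type. In fact, the operator
$\mathfrak{H}$ for a particle is (see §19)

$$\mathfrak{H} = -\frac{h^2}{8\pi^2 m}\Bigl(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\Bigr) - \frac{h^2}{8\pi^2 m}\frac{1}{r^2}\Lambda + U, \tag{135}$$

and since $\Lambda$ is an operator which does not involve $r$, it always
commutes with the first two terms of this expression. If, then, the
force is central, $U$ is a function of $r$ alone, and hence $\Lambda$ commutes
also with the last term. In that case, therefore, $\Lambda$ commutes with
$\mathfrak{H}$, and similarly, by (131), for $\mathfrak{M}^2$, so that $M^2$ is a first integral.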
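
The commutation relations of the angular momenta, and the fact that the
square of the modulus commutes with each component, can be checked on
matrices. The sketch below is an illustration under our own assumptions:
units in which $h/2\pi = 1$, and the usual 3 x 3 matrices of the components
for $l = 1$, which are taken as given and are not derived in the text.

```python
import math

HBAR = 1.0  # h / 2 pi, set to 1 (an assumption of this sketch)
S = HBAR / math.sqrt(2)

# Standard matrices of the angular momentum components for l = 1
# in the basis of the eigenvectors of M_z (taken as given, not derived here).
MX = [[0, S, 0], [S, 0, S], [0, S, 0]]
MY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
MZ = [[HBAR, 0, 0], [0, 0, 0], [0, 0, -HBAR]]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def scale(c, a):
    return [[c * a[i][j] for j in range(3)] for i in range(3)]

def commutator(a, b):
    return add(mul(a, b), scale(-1, mul(b, a)))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

ZERO = [[0] * 3 for _ in range(3)]
FACTOR = -HBAR / 1j            # -(h / 2 pi i), the coefficient in (125)
M2 = add(add(mul(MX, MX), mul(MY, MY)), mul(MZ, MZ))

print(close(commutator(MX, MY), scale(FACTOR, MZ)))   # first relation (125)
print(close(commutator(MY, MZ), scale(FACTOR, MX)))
print(close(commutator(MZ, MX), scale(FACTOR, MY)))
print(all(close(commutator(M2, m), ZERO) for m in (MX, MY, MZ)))
print(M2[0][0].real, 1 * (1 + 1) * HBAR ** 2)          # l(l + 1) (h / 2 pi)^2
```

For these matrices the square of the modulus is a multiple of the unit
matrix, with the value $l(l + 1)(h/2\pi)^2$ for $l = 1$, in agreement with (133).
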
31. Magnetic forces. In all the considerations thus far we 
have dealt only with particles acted upon by forces derivable from 
a potential U. The physically rather important case of electrically 
charged particles moving in a magnetic field does not fall within 
the preceding considerations and therefore requires an additional 
extension of the latter. This extension, like the previous ones, will 
be carried out by using the considerations of §19 as a guide. 

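If a particle of charge $e$ moves with velocity $\mathbf{v}$ in an electric field $\mathbf{E}$ and in a magnetic field $\mathbf{H}$, it will experience a force

$$\mathbf{F} = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right) \tag{136}$$

whose electric part is derivable from a potential $U$ (putting $e\mathbf{E} = -\operatorname{grad} U$), whereas the magnetic part does not have a potential. We note that the ordinary dynamical equations of a particle may in this case also be written in the Hamiltonian form, if the vector potential $\mathbf{A}$ is introduced²⁴ and the function

$$\mathcal{H} = \frac{1}{2m}\sum_{r=1}^{3}\left(p_r - \frac{e}{c}A_r\right)^2 + U \tag{138}$$

is taken for the Hamiltonian.

24 We recall that the electric field $\mathbf{E}$ and the magnetic field $\mathbf{H}$ are derived from the scalar potential $V$ and vector potential $\mathbf{A}$ by the known formulas

$$\mathbf{E} = -\operatorname{grad} V - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}. \tag{137}$$

It is further to be remembered that the potential $U$ used here is equal to $eV$.

In fact, Hamilton's equations as obtained from this expression are

$$\dot q_s = \frac{\partial\mathcal{H}}{\partial p_s} = \frac{1}{m}\left(p_s - \frac{e}{c}A_s\right),$$

$$\dot p_s = -\frac{\partial\mathcal{H}}{\partial q_s} = \frac{e}{c}\,\frac{1}{m}\sum_{r=1}^{3}\left(p_r - \frac{e}{c}A_r\right)\frac{\partial A_r}{\partial q_s} - \frac{\partial U}{\partial q_s}.$$

The first equation yields²⁵

$$p_s = m\dot q_s + \frac{e}{c}A_s, \tag{139}$$

and the second one, through use of equations (137) and (136), yields the equation of dynamics

$$m\ddot q_s = F_s.$$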
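As a rough numerical check that Hamilton's equations with the Hamiltonian (138) reproduce the motion under the force (136), the following sketch integrates them for a uniform magnetic field along the z axis with no electric field. Everything specific here is an assumption made for illustration: the units ($e = m = c = 1$, field strength 1), the gauge $\mathbf{A} = (-Hy/2,\ Hx/2,\ 0)$, the fourth-order Runge-Kutta integrator, and all function names.

```python
# Hamilton's equations for a charge in a uniform magnetic field (U = 0),
# integrated in the plane perpendicular to the field.
from math import pi, hypot

E_CHARGE, MASS, C_LIGHT, H_FIELD = 1.0, 1.0, 1.0, 1.0  # illustrative units


def vector_potential(x, y):
    """Gauge chosen for this sketch: A = (-H y / 2, H x / 2)."""
    return (-0.5 * H_FIELD * y, 0.5 * H_FIELD * x)


def velocity(state):
    """First Hamilton equation: dq/dt = (p - (e/c) A) / m."""
    x, y, px, py = state
    ax, ay = vector_potential(x, y)
    k = E_CHARGE / C_LIGHT
    return ((px - k * ax) / MASS, (py - k * ay) / MASS)


def derivatives(state):
    """Right-hand sides (dx/dt, dy/dt, dpx/dt, dpy/dt)."""
    vx, vy = velocity(state)
    k = E_CHARGE / C_LIGHT
    # Second Hamilton equation with dA_y/dx = H/2, dA_x/dy = -H/2, U = 0.
    dpx = k * vy * (0.5 * H_FIELD)
    dpy = k * vx * (-0.5 * H_FIELD)
    return (vx, vy, dpx, dpy)


def rk4_step(state, dt):
    """One classical Runge-Kutta step."""
    def shifted(s, d, f):
        return tuple(a + f * b for a, b in zip(s, d))
    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, dt / 2))
    k3 = derivatives(shifted(state, k2, dt / 2))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, t_total, steps):
    dt = t_total / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


# Period of the circular motion expected from the force (e/c) v x H.
period = 2 * pi * MASS * C_LIGHT / (E_CHARGE * H_FIELD)
start = (0.0, 0.0, 1.0, 0.0)          # x, y, px, py; here A = 0, so p = m v
half = integrate(start, period / 2, 1000)
full = integrate(start, period, 2000)

print("after half a period: position", half[:2], "momentum", half[2:])
print("speed after half a period:", hypot(*velocity(half)))
print("after a full period: position", full[:2])
```

With these choices the particle describes a circle of unit radius and returns to its starting point after one period. Halfway round, the speed is unchanged but the canonical momentum is zero, which illustrates that in a magnetic field the momentum components are not simply the velocity components multiplied by $m$.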

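Since we want to preserve the analogy mentioned in §19, when a magnetic field exists we shall obtain the equation of wave mechanics by transforming the Hamiltonian (138) into an operator by means of the usual substitution (S) (of page 322), that is, by putting

$$\mathfrak{H} = \frac{1}{2m}\sum_{r=1}^{3}\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_r} - \frac{e}{c}A_r\right)^2 + U \tag{140}$$

and writing the equation for $\psi_n$ or for $\psi$ in the usual form (81) or (82). We may also say that the operator $\mathfrak{H}$ corresponding to the presence of a magnetic field is obtained from the Hamiltonian with no field by substituting for $p_r$ the operator

$$\mathfrak{p}_r - \frac{e}{c}A_r = \frac{h}{2\pi i}\frac{\partial}{\partial q_r} - \frac{e}{c}A_r. \tag{141}$$

The Schrödinger equation for stationary states is therefore, for a particle in a magnetic field,

$$\frac{1}{2m}\sum_{r=1}^{3}\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_r} - \frac{e}{c}A_r\right)^2\psi_n + U\psi_n = E_n\psi_n, \tag{142}$$

and the time-dependent equation is

$$\frac{1}{2m}\sum_{r=1}^{3}\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_r} - \frac{e}{c}A_r\right)^2\psi + U\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}. \tag{142'}$$

25 Note the fact, expressed by formula (139), that in the presence of a magnetic field the momentum components $p_1$, $p_2$, $p_3$ are no longer the velocity components multiplied by $m$. A fixed particle in a magnetic field has momentum components different from zero.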


The justification for this extension of the Schrödinger equation is given by the following theorem, which we shall merely state here:²⁶ A sufficiently small wave packet constructed by means of the $\psi_n$ satisfying (142) will move in the manner in which a particle of charge $e$ would move under the action of forces derivable from the potential $U$, and of the magnetic field derivable from the vector potential $\mathbf{A}$.

The above considerations may immediately be extended to a system of $N$ distinct particles. In that case the Hamiltonian operator is, in the same notation as that of (84),

$$\mathfrak{H} = \sum_{j=1}^{3N}\frac{1}{2m_j}\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_j} - \frac{e_j}{c}A_j\right)^2 + U,$$

where $A_j$ denotes the component of the vector potential corresponding to the coordinate $q_j$ (and $m_j$, $e_j$ are the mass and charge of the particle to which $q_j$ belongs).

32. Determinism and quantum mechanics. In rational mechanics, as we know, the solution of the following problem is uniquely determined: given, at a certain instant $t = 0$, the positions and velocities of all the points of a system (subject to forces depending in a known way upon the positions and velocities), calculate the value of any coordinate or component of velocity of the system, or any function of these quantities, at any other instant $t_1$. This is an analytical property of the fundamental equations of mechanics, which, in the field of mechanics, gives precise expression to the general philosophical concept known as determinism.

Let us now state this problem from the viewpoint of quantum mechanics. As we have stated several times, no physical significance may be attributed to the expression "totality of the positions and velocities of the points of a system at a given instant," and hence the statement cited above, valid for classical mechanics, loses all meaning in quantum mechanics. In its stead, the following property holds.

26 For the proof, see for instance No. 14 of the Bibliography, page 109.

27 For a more thorough discussion of this argument, see for example No. 21 of the Bibliography; also A. Eddington, Sur le problème du déterminisme (Actualités Scient. et Ind.), No. 112; Paris: Hermann, 1934.

Let us suppose that at time $t = 0$ we have carried out a maximum observation. Its results represent a complete description of the system at the instant considered (or, they define its "state" completely). As we have seen, the laws of quantum mechanics allow us to calculate,²⁸ for any instant $t_1$ (possibly the same instant), the possible results of the measurement of any observable $G$, and the respective probabilities. In general, these possible results constitute a (discrete or continuous) infinity, which is to be expressed by saying that there is a certain "indeterminacy" in the value of the observed quantity (an indeterminacy which, we repeat, does not arise from an imperfection in the measurements or from a lack of completeness of initial information, but rather from the very nature of elementary phenomena, such as photon scattering and particle collisions, to which the laws of quantum mechanics are to be applied).

28 It may be useful to retrace the scheme of this calculation, which results from the preceding sections. Let $g_1(q, p)$, $g_2(q, p)$, ... be the observables measured at time zero (constituting a maximum observation), and $g_1'$, $g_2'$, ... the values found. The $\psi$ which characterizes the state of the system is determined, for $t = 0$, by the equations

$$\mathfrak{g}_1\psi_0 = g_1'\psi_0, \qquad \mathfrak{g}_2\psi_0 = g_2'\psi_0, \quad \ldots. \tag{143}$$

$\psi$ then evolves in time as governed by the Schrödinger differential equation

$$\mathfrak{H}\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}, \tag{144}$$

which, together with the initial value $\psi_0$ given by (143), defines $\psi$ at any time and in particular $\psi(t_1)$. Then we write the equation

$$\mathfrak{G}\varphi = G'\varphi; \tag{145}$$

its eigenvalues $G'$ give the possible results of a measurement of $G$, and the respective probabilities are given by $|\psi(t_1)\cdot\varphi_n|^2$. Geometrically speaking, we would say that the observations at time zero define the initial position of the state vector $\psi$, whereas (144) governs the manner in which the latter evolves in time and hence permits the calculation of the state vector at the time $t_1$. By projecting it upon the principal axes of the operator $\mathfrak{G}$, given by (145), we obtain the required probability amplitudes.

However, for certain particular observables, a single value may exist with probability of unity,²⁹ all others having probability zero; that is, the observable does have a definite value at time $t_1$. In this case, quantum mechanics evidently allows us to calculate this definite value without any uncertainty, starting from the initial conditions (results of a maximum observation).

In order to make this reasoning clearer, let us consider the example of the linear harmonic oscillator (see §39 of Part II), and let us suppose that we have measured its energy at the time $t = 0$, finding a result $E_n$. This is a maximum observation, which completely determines the state of the system (in fact, its $\psi$ will be the eigenfunction $\psi_n$ of the Schrödinger equation). If then, at any time $t_1$ (possibly the same instant), we carry out an observation of position, that is, of the observable $x$, we may find any value between $-\infty$ and $+\infty$. All we can say a priori concerning the outcome of this measurement is that its probability is distributed according to the density $|\psi_n|^2$ (that is, according to the curves of Fig. 29). However, if instead of measuring the observable $x$, we measure at time $t_1$ the observable $E$ (energy), we are certain of finding the value $E_n$ again. There is thus no uncertainty whatever for this observable. Naturally, the same conclusion would be valid for any other observable compatible with $E$.
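The scheme just described (fix the state at time zero, let it evolve, then project on the principal axes of the observable to be measured) can be followed step by step on a toy system with only two energy levels. This is a deliberately simplified stand-in for the oscillator, not the oscillator itself: the two energy values, the units ($h = 1$), the second observable $G$ with axes $(1, \pm 1)/\sqrt{2}$, and the function names are all assumptions of this sketch.

```python
# Two-level illustration: definite and indeterminate observables at time t1.
from cmath import exp
from math import pi, sqrt

H_PLANCK = 1.0
ENERGIES = (0.0, 1.0)  # illustrative eigenvalues E_0, E_1


def evolve(components, t):
    """Each component along an energy axis acquires exp(-2 pi i E_n t / h)."""
    return [c * exp(-2j * pi * e_n * t / H_PLANCK)
            for c, e_n in zip(components, ENERGIES)]


def probabilities(state, axes):
    """Squared modulus of the projection of the state on each principal axis."""
    return [abs(sum(a.conjugate() * c for a, c in zip(axis, state))) ** 2
            for axis in axes]


s = 1 / sqrt(2)
ENERGY_AXES = [(1 + 0j, 0j), (0j, 1 + 0j)]
G_AXES = [(s + 0j, s + 0j), (s + 0j, -s + 0j)]  # axes of a hypothetical observable G

# Case 1: the energy was measured at t = 0 and E_1 was found.
psi = evolve([0j, 1 + 0j], 0.3)
p_energy = probabilities(psi, ENERGY_AXES)   # certainty for the energy
p_g = probabilities(psi, G_AXES)             # a spread of results for G

# Case 2: G was measured at t = 0 instead (first value found).
chi_half = evolve([s + 0j, s + 0j], 0.5)
chi_quarter = evolve([s + 0j, s + 0j], 0.25)
p_g_half = probabilities(chi_half, G_AXES)        # definite again at t1 = 0.5
p_g_quarter = probabilities(chi_quarter, G_AXES)  # indeterminate at t1 = 0.25

print("case 1: energy", p_energy, " G", p_g)
print("case 2: G at t1 = 0.5", p_g_half, " G at t1 = 0.25", p_g_quarter)
```

In the first case the energy has a definite value at every later instant while $G$ does not. In the second case it is $G$ that has a definite value, and only at particular instants, which shows how the choice of maximum observation at time zero decides which observables are later free of indeterminacy.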

It is to be noted that the maximum observation to be carried out at time zero may be chosen with a large degree of arbitrariness, and these different ways of defining the state of the system are not equivalent, as was mentioned in §18. In fact, the observables having definite values are different in the various cases; this statement is true not only for observables relative to the time zero, but also for those relative to any instant $t_1$. In the case of the oscillator, for example, instead of $E$ we could measure the coordinate $x$, the momentum $p$, or any function of these quantities $g(x, p)$. In that case, however, it would no longer be the energy which has an a priori determinable value at time $t_1$, but (possibly) another observable $G$. It may even be shown³⁰ that, given an observable $G(x, p)$ relative to a given instant $t_1$, it is always possible to find a maximum observation to be carried out at time zero such that its result will make it possible, without indeterminacy, to calculate the value of $G$ at the time $t_1$. However, if we wanted to calculate, without indeterminacy, the value of another observable $G'$ (which might also be the same $G$ at a different instant), we should have to perform a different maximum observation at time zero, incompatible with the former if $G$ and $G'$ are incompatible with each other.

29 To be exact, this condition occurs for all those observables whose operator has a principal axis in the direction of the state vector $\psi$ at the instant $t_1$.

30 Cf. E. Fermi, Rend. Acc. Linc. XI, series 6, 1st sem. 1930, page 980, or Nuovo Cimento VII (1930), page 361.
In conclusion, we can say that the state of the system at time $t_1$ may be determined, not only by a (complete) set of observations at the time $t_1$, but also by a (complete) set of observations at time zero. In other words, the state of the system at time $t_1$ is uniquely determined by the state at time zero. In quantum mechanics, however, this proposition has a meaning different from that of classical mechanics, since we have seen that the concept of "state" in quantum mechanics does not imply the totality of positions and velocities of the point particles, as it does in classical mechanics.

It is to be noted that the above considerations apply only to "elementary" phenomena, that is, to phenomena in which only a limited number of atoms, electrons, photons, and so on, intervene; whereas for "macroscopic" phenomena, into which there enter a very large number of elementary particles (that is, for the major part of commonly observed phenomena), the ordinary relations of determinism naturally hold. This is not in contradiction with quantum mechanics; in fact, it is one of its necessary consequences. Indeed, a "macroscopic" observation (see §27) is equivalent to a measurement of an average value on a very large number of elementary systems. But in quantum mechanics, the average value of any observable is uniquely determined when the state of the system is known, as was pointed out in §27. Hence determinism in the classical sense is valid for the average values.



CHAPTER 12 
The Matrix Method 

33. General remarks. In Chapter 10 we have seen that, having fixed any complete system of orthogonal functions $y_n$ (in geometrical terminology, an orthogonal system of axes in Hilbert space), we may represent any vector $f$ of this space by its components $f_n$ with respect to these axes. Similarly, every linear operator $\mathfrak{A}$ may be represented by a certain matrix $\{\mathfrak{A}\}$. Let us now apply this representation to the state vector $\psi$ (with which we have dealt at length in the last chapter) and to the linear operators which operate upon it and which, as we have seen, correspond to the various observables. Hence we are now to consider the state of the system as defined by the aggregate of the (infinitely many) components of the vector $\psi$ lying along the preassigned system of axes $y_n$, and to make a matrix, rather than an operator, correspond to every observable. In this way the algebraic relations between observables will now be translated into an equal number of relations between matrices. Therefore this method of dealing with problems of quantum mechanics is called the matrix method; it is of course entirely equivalent, from the theoretical standpoint, to the operator method and to the method of wave mechanics which we have used thus far. Practical reasons alone determine the preference of one or the other method in the various cases.

In order to understand better the respective positions of the two methods, we may compare the operator method with that method of rational mechanics which is purely vectorial without the use of systems of reference; the wave-mechanical method and the matrix method might be compared with the procedure of classical mechanics using a particular coordinate system. More precisely, in the case of wave mechanics, Hilbert space refers to that particular system of axes which we have called "continuous" in §2 (each specified by a group of values of the "coordinates" of the system), whereas in the matrix method we refer to a general system of discrete axes.

The matrix method, as pointed out in Part I, was invented by Heisenberg and was the first form taken by quantum mechanics. However, the point of view from which it was presented in the early days is considerably different from the one just mentioned, to which we shall adhere in the remainder of this chapter.

Let us now briefly outline the fundamental idea of this method. 

First let us fix a system of reference axes in Hilbert space, selecting¹ a certain observable $K$ and taking for our reference axes the principal axes of its operator $\mathfrak{K}$, that is, the directions specified by the eigenfunctions $y_n$ of the equation

$$\mathfrak{K}y_n = K_n y_n;$$

we shall then say that the matrices we are using are referred to the "$K$-representation." As we shall see, the $\mathcal{H}$-representation (where $\mathcal{H}$ is the energy) is of particular interest. In this scheme the directions of the axes of reference are evidently given by the eigenfunctions $\psi_n$ of the Schrödinger equation.

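Once the "representation" has been fixed, there corresponds, to every observable $A$, a Hermitian matrix

$$\{\mathfrak{A}\} = \begin{pmatrix} A_{11} & A_{12} & A_{13} & \cdots \\ A_{21} & A_{22} & A_{23} & \cdots \\ A_{31} & A_{32} & A_{33} & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}.$$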

If we want to relate this representation to the operator method, we must remember (see §5) that the elements of this matrix are obtained from the operator $\mathfrak{A}$ (corresponding to the observable $A$ according to the rule of §22) by the formula

$$A_{mn} = (\mathfrak{A}y_n)\cdot y_m = \int y_m^*\,\mathfrak{A}y_n\,dS. \tag{147}$$

These elements are the coefficients of the linear transformation which passes from the components of any vector $f$ to the components of $\mathfrak{A}f$ or, briefly, of the transformation which expresses the effect of the operator. In the matrix method, however, we consider the $A_{mn}$ as elements characteristic of the observable, without relating them to the expression (147).

1 If the system has several degrees of freedom, it is to be understood that $K$ represents a maximum observable, or a complete set of observables (see §18), and the $y_n$ are the eigenfunctions common to all their operators.

We recall from §12 that, in particular, the matrix which in the $K$-representation stands for the observable $K$, that is, the same matrix which serves to define the representation, is a diagonal matrix

$$\{\mathfrak{K}\} = \begin{pmatrix} K_1 & 0 & 0 & \cdots \\ 0 & K_2 & 0 & \cdots \\ 0 & 0 & K_3 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}$$

whose elements are simply the eigenvalues of the operator $\mathfrak{K}$ (see §10) and hence represent the possible results of a measurement of $K$. The elements $A_{mn}$ of the matrix $\{\mathfrak{A}\}$, however, generally do not have any simple physical significance, except those of the principal diagonal. In fact, $A_{nn}$ represents [see formula (115), §27] the average value of the observable $A$ in the state specified by the vector $y_n$, that is, in the state defined by the value $K_n$ of $K$. The elements off the main diagonal, $A_{mn}$ ($m \neq n$), although not susceptible of simple interpretation, are nevertheless related in an important way to the statistics of the experimental results. In fact, the average value of any power of $A$ (in the state defined by $K_n$) is evidently given by the $n$th diagonal term of the corresponding power of the matrix $\{\mathfrak{A}\}$, and in its calculation the nondiagonal elements of $\{\mathfrak{A}\}$ also enter.
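To make formula (147) and the remark on nondiagonal elements concrete, the following sketch computes a few matrix elements by numerical integration. The choice of system is purely an assumption for illustration and does not come from the text: the axes $y_n(x) = \sqrt{2}\sin(n\pi x)$ on the interval $0 \le x \le 1$, the observable "multiplication by $x$", Simpson's rule for the integrals, and a truncation to eight axes.

```python
# Matrix elements A_mn = integral of y_m^* (A y_n) dS for an illustrative basis.
from math import sin, pi, sqrt


def y(n, x):
    """Assumed orthonormal axes on [0, 1] (real functions, so y* = y)."""
    return sqrt(2.0) * sin(n * pi * x)


def simpson(f, a, b, intervals=2000):
    """Composite Simpson rule with an even number of intervals."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def matrix_element(m, n):
    """Formula (147) with the operator 'multiply by x'."""
    return simpson(lambda x: y(m, x) * x * y(n, x), 0.0, 1.0)


SIZE = 8
A = [[matrix_element(m, n) for n in range(1, SIZE + 1)]
     for m in range(1, SIZE + 1)]

# Diagonal element: average value of x in the first state.
mean_x = A[0][0]

# First diagonal term of the squared matrix: it needs the off-diagonal A_1r.
mean_x2_from_matrix = sum(A[0][r] * A[r][0] for r in range(SIZE))
mean_x2_direct = simpson(lambda x: y(1, x) * x * x * y(1, x), 0.0, 1.0)

print("A_11 =", mean_x)
print("A_12 =", A[0][1], " A_21 =", A[1][0])
print("(A^2)_11 from the truncated matrix:", mean_x2_from_matrix)
print("average of x^2 by direct integration:", mean_x2_direct)
```

The squared diagonal element alone gives 0.25, noticeably less than the average of $x^2$; the difference is supplied by the off-diagonal elements, and the small remaining discrepancy comes from truncating the matrix to eight rows and columns.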

Of particular interest are the elements of the three matrices $\{X\}$, $\{Y\}$, $\{Z\}$ representing the components of the electric moment of the system in the $\mathcal{H}$-representation, or the expressions

$$X_{mn} = \int \psi_m^*\Big(\sum e x\Big)\psi_n\,dS, \quad \text{and so on;}$$

in fact, as was mentioned in §32 of Part II, the radiation emitted (or absorbed) in the transition from the state $m$ to the state $n$ corresponds qualitatively to the radiation which would be emitted by an oscillator whose electric moment has the components $X_{mn}$, $Y_{mn}$, $Z_{mn}$ along the three axes.

34. Commutation relations. The algebraic relations between observables may be translated into relations of the same form between the matrices which represent them, it being understood, of course, that the operations of sum and product of matrices are defined by the rules of §6. In particular, between the matrices $\{q_k\}$ and $\{p_k\}$ representing the coordinates and momenta, the following commutation relations hold in any representation:

$$\{p_k\}\{q_k\} - \{q_k\}\{p_k\} = \frac{h}{2\pi i}\{1\}, \tag{148}$$

$$\{p_k\}\{q_l\} - \{q_l\}\{p_k\} = 0 \quad (\text{for } k \neq l); \tag{148'}$$

and, for any $k$ and $l$,

$$\{q_k\}\{q_l\} - \{q_l\}\{q_k\} = 0, \tag{149}$$

$$\{p_k\}\{p_l\} - \{p_l\}\{p_k\} = 0. \tag{150}$$

If then $G$ is a function of the $q$ and the $p$ of the form

$$G = P(p) + Q(q)$$

(where $P$ stands for a rational integral function, and $Q$ for any function), a matrix $\{\mathfrak{G}\}$ will correspond to $G$ for which the following commutation relations, which follow immediately from (111) and (112) of §26, will hold in any representation:

$$\{\mathfrak{G}\}\{q_k\} - \{q_k\}\{\mathfrak{G}\} = \frac{h}{2\pi i}\left\{\frac{\partial G}{\partial p_k}\right\}, \tag{151}$$

$$\{p_k\}\{\mathfrak{G}\} - \{\mathfrak{G}\}\{p_k\} = \frac{h}{2\pi i}\left\{\frac{\partial G}{\partial q_k}\right\}. \tag{152}$$

In (151) and (152), the preceding relations are contained as special cases.

35. Search for eigenvalues by the matrix method. After these preliminary remarks, we shall see how the problem of finding the eigenvalues of an observable $G$ (which, in particular, may be the energy) presents itself in the matrix method.

We must start, as in §22, from the analytic expression for the observable $G$ as a function of the $q$ and the $p$. This expression takes the place of a definition of $G$ and, as has been mentioned, is usually constructed by analogy to classical mechanics. Sometimes it is necessary to remember that we must "symmetrize" the products $p_k q_k$ in order that the matrix corresponding to $G$ be Hermitian. Once this expression $G(q, p)$ has been constructed, it may be interpreted as a relation among the matrices corresponding to the observables $G$, $q$, $p$:

$$\{\mathfrak{G}\} = G(\{q\}, \{p\}), \tag{153}$$

which is true in any representation. In particular, in the $G$-representation the same relation will hold, with the additional condition that the matrix $\{\mathfrak{G}\}$ be diagonal. Hence if we represent $\{\mathfrak{G}\}$ in that representation, the problem is to determine the elements of the matrices $\{q\}$ and $\{p\}$ in such a way that the latter satisfy the commutation relations (148) and (150), and furthermore that the matrix $\{\mathfrak{G}\}$ calculated by means of (153) be diagonal. After solution of this problem (for which no general methods can be given), the elements of the diagonal matrix $\{\mathfrak{G}\}$ will yield the required eigenvalues. We shall show an example of this procedure in the following section.

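36. Application to the problem of the harmonic oscillator. Let us take the case of a harmonic oscillator of mass $m$ and restoring force $-Kx$, treated by wave-mechanical methods in §39 of Part II, and let us try to find, by the matrix method, the values its energy may assume.

The expression for the energy as a function of $x$ and $p = m\dot x$ (Hamiltonian) is, in analogy to classical mechanics,

$$\mathcal{H} = \frac{1}{2m}p^2 + \frac{K}{2}x^2.$$

We are to determine the elements of the matrices $\{x\}$ and $\{p\}$ (referred to the $\mathcal{H}$-representation) such that the commutation relation

$$\{p\}\{x\} - \{x\}\{p\} = \frac{h}{2\pi i}\{1\}$$

hold, and such that the matrix

$$\{\mathfrak{H}\} = \frac{1}{2m}\{p\}^2 + \frac{K}{2}\{x\}^2$$

be diagonal. Translating these matrix equations into equations for the corresponding elements, and indicating by $E_n$ the diagonal elements ($H_{nn}$) of the matrix $\{\mathfrak{H}\}$, that is, the desired eigenvalues, we have*

$$\sum_r \left(p_{jr}x_{rk} - x_{jr}p_{rk}\right) = \frac{h}{2\pi i}\,\delta_{jk}, \tag{156}$$

$$\frac{1}{2m}\sum_r p_{jr}p_{rk} + \frac{K}{2}\sum_r x_{jr}x_{rk} = E_j\,\delta_{jk}. \tag{157}$$

* In this problem, we shall number the rows and columns of the matrices from 0 rather than 1, in order to conform to the convention adopted in the wave-mechanical treatment of the same problem, in which we have numbered the eigenvalues $E_0$, $E_1$, and so on.

Then applying to the matrix $\{\mathfrak{H}\}$ the commutation rules (151) and (152), we find

$$\{p\}\{\mathfrak{H}\} - \{\mathfrak{H}\}\{p\} = \frac{h}{2\pi i}K\{x\}, \qquad \{x\}\{\mathfrak{H}\} - \{\mathfrak{H}\}\{x\} = -\frac{h}{2\pi i}\,\frac{1}{m}\{p\}.$$

These equations may be translated into the following relations between the matrix elements:

$$\sum_r \left(p_{jr}H_{rk} - H_{jr}p_{rk}\right) = \frac{h}{2\pi i}K x_{jk}, \qquad \sum_r \left(x_{jr}H_{rk} - H_{jr}x_{rk}\right) = -\frac{h}{2\pi i}\,\frac{1}{m}p_{jk},$$

or, since $\{\mathfrak{H}\}$ is diagonal,

$$p_{jk}(E_k - E_j) = \frac{h}{2\pi i}K x_{jk}, \tag{158}$$

$$x_{jk}(E_k - E_j) = -\frac{h}{2\pi i}\,\frac{1}{m}p_{jk}, \tag{158'}$$

from which we get, upon eliminating $p_{jk}$,

$$x_{jk}\left[1 - \frac{4\pi^2 m}{h^2 K}(E_k - E_j)^2\right] = 0,$$

or else, upon introducing the classical frequency of the oscillator $\nu_0 = \dfrac{1}{2\pi}\sqrt{\dfrac{K}{m}}$,

$$x_{jk}\left[1 - \left(\frac{E_k - E_j}{h\nu_0}\right)^2\right] = 0. \tag{159}$$

From this it may be seen that the elements $x_{jk}$ are all zero, except for those whose indices $j$ and $k$ are such that

$$E_k - E_j = \pm h\nu_0. \tag{160}$$

An analogous observation may be made for the elements $p_{jk}$. Therefore in the matrices $\{x\}$ and $\{p\}$ there are at most two elements different from zero in each row and each column.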
Now the commutation relation (156) yields in particular, for a diagonal element ($j = k$),

$$\sum_r \left(p_{kr}x_{rk} - x_{kr}p_{rk}\right) = \frac{h}{2\pi i}.$$

Solving for the $p$ in (158) and noting that $x_{kr} = x_{rk}^*$, we have

$$2K\sum_r \frac{|x_{kr}|^2}{E_r - E_k} = 1 \qquad (k = 0, 1, 2, \ldots). \tag{161}$$

From this expression we see that in each row of the matrix $\{x\}$ there is at least one nonzero element; otherwise, if the $k$th row consisted of all zero elements, the summation of (161) would be zero and the relation could not hold for this value of $k$. Hence if a certain number $E_k$ is an eigenvalue, at least one of the quantities $E_k \pm h\nu_0$ is also an eigenvalue. This condition is satisfied if the eigenvalues form an arithmetic progression of constant difference $h\nu_0$, that is, when they are given by the formula

$$E_k = \epsilon + kh\nu_0, \tag{162}$$

where $\epsilon$ is a constant which we shall determine presently. When the eigenvalues are numbered in this fashion, condition (160) is satisfied only for $j = k \pm 1$. Hence in the matrices $\{x\}$ and $\{p\}$, only the elements of the type $x_{k,k\pm1}$, $p_{k,k\pm1}$ are different from zero (these elements form two oblique lines, parallel and adjacent to the principal diagonal on either side). Expression (161) then yields, for $k = 0$,

$$\frac{2K}{h\nu_0}|x_{01}|^2 = 1, \tag{163}$$

and for $k = 1, 2, \ldots$,

$$2K\left[\frac{|x_{k,k+1}|^2}{h\nu_0} - \frac{|x_{k,k-1}|^2}{h\nu_0}\right] = 1,$$

which may also be written

$$|x_{k,k+1}|^2 - |x_{k-1,k}|^2 = \frac{h\nu_0}{2K}, \tag{164}$$

showing that the quantities $|x_{k,k+1}|^2$ form an arithmetic progression of constant difference $h\nu_0/2K$. The first term of that progression is given by (163), and is

$$|x_{01}|^2 = \frac{h\nu_0}{2K}. \tag{163'}$$

The general term is therefore

$$|x_{k,k+1}|^2 = (k+1)\frac{h\nu_0}{2K} = (k+1)\frac{h}{8\pi^2 m\nu_0}.$$

Consequently, for the elements $x_{k,k+1}$ of the matrix $\{x\}$ we may take*

$$x_{k,k+1} = \sqrt{k+1}\,\sqrt{\frac{h}{8\pi^2 m\nu_0}}. \tag{165}$$

We can deduce the elements of the form $x_{k,k-1}$ from these, observing that since the matrix must be Hermitian, $x_{k,k-1} = x_{k-1,k}^*$, and that this last quantity is obtained from (165) by simply changing $k$ into $(k-1)$. Hence

$$x_{k,k-1} = \sqrt{k}\,\sqrt{\frac{h}{8\pi^2 m\nu_0}}. \tag{165'}$$

From these formulas by means of (158) or (158') we obtain the expressions for the nonzero elements of the matrix $\{p\}$ as follows:

$$p_{k,k+1} = -i\sqrt{k+1}\,\sqrt{\tfrac{1}{2}mh\nu_0}, \tag{166}$$

$$p_{k,k-1} = i\sqrt{k}\,\sqrt{\tfrac{1}{2}mh\nu_0}. \tag{166'}$$

Thus the matrices $\{x\}$ and $\{p\}$ are completely determined. In order to find the energy levels, we have only to determine the constant $\epsilon$ of (162). This may be obtained immediately by writing (157) for the particular case $j = k = 0$. We have

$$\frac{1}{2m}|p_{01}|^2 + \frac{K}{2}|x_{01}|^2 = E_0,$$

and taking $|x_{01}|^2$ and $|p_{01}|^2$ from (163') and (166), we find

$$E_0 = \frac{h\nu_0}{4} + \frac{h\nu_0}{4} = \frac{h\nu_0}{2}.$$

Hence in (162) we must take $\epsilon = h\nu_0/2$, and the expression for the energy eigenvalues becomes

$$E_k = \left(k + \tfrac{1}{2}\right)h\nu_0, \tag{167}$$

which coincides with the expression found by wave mechanics in §39 of Part II.

* Of course we could add to these expressions a factor of the form $e^{i\delta_k}$, with $\delta_k$ arbitrary, but the eigenvalues would be the same, as may readily be recognized. This operation would correspond to a multiplication of the versors specifying the axes in Hilbert space, by factors of modulus 1, which do not alter anything of importance.

It may easily be verified that the expressions found for the elements of the matrices $\{x\}$ and $\{p\}$ also satisfy the relations (156) and (157) (which we have specialized by using only $j = k$) for $j \neq k$.

The elements of the matrices $\{x\}$ and $\{p\}$ which we have calculated (and which enter into the problems of radiation theory) could also be calculated by means of their wave-mechanical expression [see (147)]:

$$x_{jk} = \int \psi_j^*\,x\,\psi_k\,dx \tag{168}$$

with the expressions found in §39, Part II, inserted for the eigenfunctions $\psi_k$. However, this procedure would lead to calculations considerably longer than those developed in this section. In the case of the oscillator, therefore, the matrix method presents certain advantages over the Schrödinger method.
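The verification mentioned above can be carried out numerically on a finite corner of the infinite matrices. The sketch below builds $\{x\}$ and $\{p\}$ from (165) to (166') and checks the commutation relation and the diagonal form of the energy matrix. The units ($h = m = \nu_0 = 1$), the truncation to six rows and columns, and the helper names are assumptions of this example; because of the truncation, the last diagonal element is not expected to be correct.

```python
# Truncated oscillator matrices {x} and {p} in the energy representation.
from math import pi, sqrt

H_PLANCK = 1.0
MASS = 1.0
NU0 = 1.0
K_FORCE = 4 * pi ** 2 * MASS * NU0 ** 2  # from nu0 = (1 / 2 pi) sqrt(K / m)
N = 6                                    # rows and columns kept (numbered from 0)


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]


def build_matrices(n):
    """Fill the two oblique lines next to the principal diagonal."""
    x = [[0j] * n for _ in range(n)]
    p = [[0j] * n for _ in range(n)]
    ax = sqrt(H_PLANCK / (8 * pi ** 2 * MASS * NU0))
    ap = sqrt(MASS * H_PLANCK * NU0 / 2)
    for k in range(n - 1):
        root = sqrt(k + 1)
        x[k][k + 1] = root * ax          # (165)
        x[k + 1][k] = root * ax          # (165') with k replaced by k + 1
        p[k][k + 1] = -1j * root * ap    # (166)
        p[k + 1][k] = 1j * root * ap     # (166') with k replaced by k + 1
    return x, p


x, p = build_matrices(N)
px, xp = matmul(p, x), matmul(x, p)
pp, xx = matmul(p, p), matmul(x, x)

# Commutator {p}{x} - {x}{p} and energy matrix (1/2m){p}^2 + (K/2){x}^2.
comm = [[px[j][k] - xp[j][k] for k in range(N)] for j in range(N)]
ham = [[pp[j][k] / (2 * MASS) + K_FORCE * xx[j][k] / 2 for k in range(N)]
       for j in range(N)]

print("diagonal of the energy matrix:", [round(ham[k][k].real, 12) for k in range(N)])
print("diagonal of the commutator:", [comm[k][k] for k in range(N)])
print("h / (2 pi i) =", H_PLANCK / (2j * pi))
```

Apart from the last row and column, the diagonal of the energy matrix reproduces $(k + \tfrac{1}{2})h\nu_0$, its off-diagonal elements vanish, and the commutator has the value $h/2\pi i$ on the diagonal.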
37. General remarks. It is a well-known fact that in celestial
mechanics the problem of the motion of the planets under the 
action of mutual attractions and of the attraction by the sun would 
be practically insoluble if all these forces were of the same order of 
magnitude. Fortunately, the attraction of the sun by far exceeds 
the attraction between the planets, so that the motion of a planet 
can be calculated by a method of successive approximations. This 
so-called "perturbation method" is capable of an accuracy which
is more than sufficient for practical purposes. Essentially, it con¬ 
sists in first considering each planet to be subject to the solar 
attraction alone, under which assumption the problem is known 
to be rigorously soluble; then we calculate, by successive approxi¬ 
mations, to what extent the motion is modified by the effect of the 
originally neglected mutual attractions (perturbing forces). An 
analogous procedure is followed in atomic physics in order to solve 
those problems in which the particles may be considered to be 
subject to certain predominant forces which, if they alone were 
present, would allow the complete solution of the problem ("unperturbed problem"), and subject also to other weaker perturbing
forces. Thus, for instance, in the study of an atom located in a 
magnetic field (Zeeman effect), we can consider the action of the 
field upon the electrons of the atom as a perturbing force; supposing 
that we know how to solve the problem (that is, how to determine 
the eigenfunctions and eigenvalues) for the atom in the absence of 
the field, we can, by the method about to be described, determine the 
perturbation produced by the magnetic field on the eigenfunctions 
and eigenvalues. 

Perturbation theory is of very great importance in atomic physics, since many problems of great experimental interest would lead, either by the method of wave mechanics or by the matrix method, to mathematical difficulties which would be either very acute or altogether insuperable, were it not for the perturbation method. This method may be adapted to the wave-mechanical as well as to the matrix approach, as we shall show in the following sections.

38. Perturbation of stationary states (nondegenerate case). First let us consider the unperturbed system, and let us call $\mathcal{H}^0(q, p)$ the Hamiltonian which describes it (we shall generally distinguish the quantities referring to the unperturbed problem by a superscript 0). Then the Schrödinger equation for the $i$th stationary state of the unperturbed system will be

$$\mathfrak{H}^0\psi_i^0 = E_i^0\psi_i^0, \tag{169}$$

where $\mathfrak{H}^0$ denotes, as usual, the operator obtained from $\mathcal{H}^0(q, p)$ by substituting $(h/2\pi i)(\partial/\partial q_r)$ for every $p_r$, according to the rules of §22. Let us suppose that we know how to solve the unperturbed problem completely, that is, that we know all the eigenvalues $E_i^0$ (which we assume to be discrete)¹ and the respective eigenfunctions $\psi_i^0$. Let us fix our attention on a definite state, the $n$th state for example, and let us determine the effect upon the latter due to the perturbing forces, that is, the modifications produced in $\psi_n^0$ and $E_n^0$. We shall assume in this section that the eigenvalue $E_n^0$ is not multiple. The case of multiple eigenvalues (degeneracy) requires separate treatment, which will be described in the following section. On the other hand, the other eigenvalues may be multiple, but in that case each of them will be counted as $p$ coinciding eigenvalues (denoted by distinct indices), if it is multiple of order $p$.

The perturbing forces will be represented by a term $\mathcal{L}(q, p)$ added to the Hamiltonian, which will now be $\mathcal{H} = \mathcal{H}^0 + \mathcal{L}$. Hence the corresponding operator will become $\mathfrak{H}^0 + \mathfrak{L}$, where $\mathfrak{L}$ is obtained from $\mathcal{L}(q, p)$ by the usual substitution (including any necessary symmetrization of the products $p_r q_r$), and is Hermitian. If, for instance, the perturbing forces are derivable from a potential $u(q)$, we shall have (recalling that $\mathcal{H}^0 = T + U$)

$$\mathfrak{L} = \mathcal{L} = u(q).$$

1 If the eigenvalues are partly discrete and partly continuous, and if the eigenvalue which we are considering is discrete, the formulas of this and of the following section will still hold, provided we replace certain summations by integrals. If, however, the eigenvalue considered belongs to the continuous spectrum, a somewhat different procedure is required (see, for instance, No. 1^ of the Bibliography, page 157).

In more general cases, as with magnetic forces, $\mathcal{L}$ will depend on the $p$ in addition to the $q$, and therefore $\mathfrak{L}$ will also contain derivatives.

We shall suppose in every case that $\mathcal{L}$ does not contain $t$ explicitly; this supposition may be expressed by saying that the perturbation is "time-independent." The opposite case will be discussed in §41.

The Schrödinger equation of the perturbed system for the $n$th stationary state will be (where $E_n = E_n^0 + \epsilon$ and $\psi_n = \psi_n^0 + \chi$ are called, respectively, the perturbed eigenvalue and the perturbed eigenfunction)²

$$(\mathfrak{H}^0 + \mathfrak{L})\psi_n = E_n\psi_n. \tag{170}$$

The function $\chi$, which represents the effect of the perturbation upon $\psi_n^0$, may be expanded in a series in terms of the unperturbed eigenfunctions (which, as we know, form a complete orthogonal system) and may therefore be written

$$\chi = \sum_{s=1}^{\infty} a_s\psi_s^0, \tag{171}$$

where the $a_s$ are a set of constant coefficients which define the effect of the perturbation upon $\psi_n^0$. Substituting this expression into (170) and taking (169) into account (for $i = n$), we have

$$(-\epsilon + \mathfrak{L})\psi_n^0 + \sum_s a_s\left(E_s^0 - E_n^0 - \epsilon + \mathfrak{L}\right)\psi_s^0 = 0.$$

We now multiply both terms by $\psi_n^{0*}$ (from the left) and integrate over all $q$-space, making use of the conditions of orthogonality and normalization of the $\psi_s^0$. We indicate, as usual, by $L_{rs}$ the elements of the matrix which represents the operator $\mathfrak{L}$ in the $\mathcal{H}^0$-representation (perturbation matrix), as follows:

$$L_{rs} = \int \psi_r^{0*}\,\mathfrak{L}\,\psi_s^0\,dS. \tag{172}$$

We then get

$$\epsilon(1 + a_n) = L_{nn} + \sum_s a_s L_{ns}. \tag{173}$$

2 We should write $\epsilon_n$, $\chi_n$, and later on, $a_{ns}$, because for every value of $n$ there exists an $\epsilon$, a $\chi$, and a system of coefficients $a$. For the sake of simplicity we shall omit the index $n$, which may be considered fixed during the whole development. This practice will be followed frequently in the following pages.

Similarly, multiplying by $\psi_\sigma^{0*}$ (we indicate by $\sigma$ an index taking on all positive integral values except $n$) and integrating, we have

$$a_\sigma\left(E_n^0 - E_\sigma^0 + \epsilon\right) = L_{\sigma n} + \sum_s a_s L_{\sigma s} \qquad (\sigma \neq n). \tag{174}$$

All the formulas derived thus far hold rigorously, no matter what the order of magnitude of $\mathfrak{L}$. Making use of the circumstance that the effect of the perturbation is small, let us consider the $L_{rs}$, $\epsilon$, and the $a_s$ as small quantities of the first order.³ If we then neglect in (173) the quantities of the second order, that expression yields for $\epsilon$ a value in first approximation which we shall denote by $\epsilon'$:

$$\epsilon' = L_{nn}. \tag{175}$$

Therefore: To a first approximation, the perturbation of the nth 
eigenvalue is given by the nth diagonal term of the perturbation matrix, 
or also, if desired (see §27), by the average value of $\mathcal{L}$ calculated for
the nth stationary state of the unperturbed system. This result is 
entirely analogous to a known theorem of classical mechanics, 
according to which the correction to be applied to the energy of a 
system due to the effect of a perturbing force is equal to the average 
of the potential of this force, taken over the unperturbed motion. 

³ More precisely, we suppose all the $L_{n\sigma}$ to be small of first order compared
with the differences $E_n^0 - E_\sigma^0$. It follows that $\epsilon$ and the $a$ are also small of
the first order (compared to $E_n^0$ and 1, respectively).

It is to be noted that in order to calculate (in first approximation)
the perturbed eigenvalues, it is not necessary to know the perturbed
eigenfunctions $\psi_n$. Often the latter are not of interest and can be
dispensed with. If we want to know them to a first approximation,
we must solve (174) for the $a_\sigma$, which, if the terms of second order
are neglected and the first approximation of $a_\sigma$ is denoted by $a'_\sigma$,
yields

$$a'_\sigma = \frac{L_{\sigma n}}{E_n^0 - E_\sigma^0}. \qquad (176)$$

In order to obtain all the terms of the expansion (171), we still
have to know $a_n$. This is determined by imposing the condition
upon $\psi_n$ that it be normalized. Neglecting quantities of the second
order and denoting, as usual, the first approximation by a prime,
we then obtain

$$\sum_{s=1}^{\infty} a'_s \int \psi_n^{0*}\psi_s^0\,dS + \sum_{s=1}^{\infty} a'^{*}_s \int \psi_s^{0*}\psi_n^0\,dS = 0,$$

that is, $a'_n + a'^{*}_n = 0$,

which means that $a'_n$ must be pure imaginary but otherwise arbitrary
(provided it is small of first order). If we take it to be zero,⁴
we may write, by (171),

$$\psi_n = \psi_n^0 + \sum_{\sigma \ne n} \frac{L_{\sigma n}}{E_n^0 - E_\sigma^0}\,\psi_\sigma^0. \qquad (177)$$

As far as the intuitive aspect is concerned, it might be mentioned
that the effect of the perturbation upon the nth stationary state
is one of "admixing" to the eigenfunction $\psi_n^0$ each of the other
$\psi_\sigma^0$ in an amount which is the larger, the closer the respective energy
levels lie to the level considered, and the more appreciable the
element $L_{\sigma n}$ of the perturbation matrix.
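The admixture described by (176) and (177) can be tried out numerically. The following is a minimal sketch, assuming a hypothetical three-level system whose energies and (real, symmetric) perturbation matrix are invented for illustration; the function names are likewise not from the text. It builds the first-order coefficients and checks that the perturbed equation (170) is then satisfied up to terms of the second order.

```python
# Hypothetical three-level model: all numbers below are illustrative only.
E0 = [1.0, 2.0, 4.0]                      # unperturbed eigenvalues E_s^0
L = [[0.01, 0.02, 0.03],
     [0.02, -0.01, 0.01],
     [0.03, 0.01, 0.02]]                  # real symmetric perturbation matrix L_rs
n = 0                                     # index of the state being perturbed


def first_order_state(E0, L, n):
    """Coefficients of psi_n in the psi^0 basis according to (177)."""
    coeffs = []
    for s in range(len(E0)):
        if s == n:
            coeffs.append(1.0)            # a_n is taken to be zero
        else:
            coeffs.append(L[s][n] / (E0[n] - E0[s]))   # a'_sigma from (176)
    return coeffs


def residual(E0, L, energy, vec):
    """Largest component of (H0 + L - energy) applied to vec."""
    size = len(E0)
    worst = 0.0
    for r in range(size):
        h_vec = E0[r] * vec[r] + sum(L[r][s] * vec[s] for s in range(size))
        worst = max(worst, abs(h_vec - energy * vec[r]))
    return worst


psi_first = first_order_state(E0, L, n)
res_zero = residual(E0, L, E0[n], [1.0, 0.0, 0.0])        # unperturbed guess
res_first = residual(E0, L, E0[n] + L[n][n], psi_first)   # (175) with (177)
print(psi_first, res_zero, res_first)
```

The nearer level (s = 1) is admixed more strongly than the distant one, and the residual drops from first-order size to second-order size.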

Let us now proceed to the calculation of the eigenvalues in
second approximation. In order to obtain $\epsilon$ from (173) in second
approximation, it is necessary to insert the values of the $a$; but
since the latter appear multiplied by quantities of the first order,
it will be sufficient to introduce the values of the first approximation,
namely, (176) and $a'_n = 0$. Denoting by $\epsilon''$ the terms of second
order of $\epsilon$, we then obtain

$$\epsilon' + \epsilon'' = L_{nn} + \sum_\sigma \frac{L_{n\sigma}L_{\sigma n}}{E_n^0 - E_\sigma^0} \qquad (\sigma \ne n). \qquad (178)$$

Continuing in an analogous manner, we would calculate the
eigenfunctions of the second approximation, and by means of the
latter, the third approximation ($\epsilon' + \epsilon'' + \epsilon'''$) of $\epsilon$, and so forth.
These formulas are omitted here. They are rarely applied, since
the first or second approximation for the eigenvalues and the first
approximation for the eigenfunctions usually suffice.
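As a rough numerical check of (175) and (178), one may compare them with a case that can be solved exactly. The sketch below assumes a hypothetical two-level system (the numbers and function names are illustrative, not from the text), for which the exact eigenvalue of the perturbed matrix is available in closed form.

```python
import math

E0 = [1.0, 3.0]                 # hypothetical unperturbed levels E_s^0
L = [[0.02, 0.05],
     [0.05, -0.01]]             # hypothetical real symmetric perturbation matrix
n = 0                           # we follow the lower level


def energy_first_order(E0, L, n):
    """E_n^0 + epsilon', with epsilon' from (175)."""
    return E0[n] + L[n][n]


def energy_second_order(E0, L, n):
    """E_n^0 + epsilon' + epsilon'', using (178)."""
    eps2 = sum(L[n][s] * L[s][n] / (E0[n] - E0[s])
               for s in range(len(E0)) if s != n)
    return E0[n] + L[n][n] + eps2


def exact_lower_level(E0, L):
    """Lower eigenvalue of the 2x2 matrix H0 + L in closed form."""
    h11, h22, h12 = E0[0] + L[0][0], E0[1] + L[1][1], L[0][1]
    mean, half_gap = 0.5 * (h11 + h22), 0.5 * (h22 - h11)
    return mean - math.sqrt(half_gap ** 2 + h12 ** 2)


e_first = energy_first_order(E0, L, n)
e_second = energy_second_order(E0, L, n)
e_exact = exact_lower_level(E0, L)
print(e_first, e_second, e_exact)
```

With these numbers the first approximation is off by about one part in a thousand, while the second approximation agrees with the exact value to a few parts in a hundred thousand.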

⁴ The arbitrariness of $a_n$ reflects the arbitrary nature of the "phase" of
$\psi_n$ and has no practical consequences. In fact, putting $a_n = i\delta_n$ (with $\delta_n$ real
and small of first order), we obtain to a first approximation, $1 + a_n = e^{i\delta_n}$, and
since in (171) $\psi_n^0$ appears just multiplied by $(1 + a_n)$, it suffices to substitute
$\psi_n^0 e^{i\delta_n}$ for $\psi_n^0$ to get back to the case of $a_n = 0$.
39. Perturbation of stationary states (degenerate and quasi-degenerate
case). In order for the approximations developed in
the last section to be valid, it is necessary, as we have pointed out,
that we have

$$|L_{\sigma n}| \ll |E_n^0 - E_\sigma^0|. \qquad (179)$$

This necessity is apparent from formula (176), which shows that
if for some $\sigma$ this condition is not satisfied, the corresponding $a'_\sigma$ is
no longer a small quantity of the first order. The condition (179)
will no longer be satisfied if the displacement of the nth level produced
by the perturbation is of the same order of magnitude as the
distance from the level to the neighboring levels. Therefore, if the
level in question is rather close to some other level, even a slight
perturbation will no longer permit the preceding arguments to be
applied. As a limit of this situation, we may consider the case in
which the level $E_n^0$ is multiple (degeneracy), since we may imagine
that we approach this condition gradually upon letting one or more
of the other eigenvalues $E_\sigma^0$ approach $E_n^0$ until they coincide. Hence
in this section we shall investigate how the perturbation method
is to be modified in the case where there is a group of eigenvalues
(which we shall call $E_1^0, E_2^0, \ldots E_p^0$) lying very close to one another,
in such a way as to form a rather dense group called a "multiplet"
(quasi-degeneracy), or actually coinciding so as to form a multiple
eigenvalue (degeneracy). Degeneracy will be considered as a limiting
case of quasi-degeneracy. It is to be noted that these circumstances,
which mathematically seem to be exceptional, are actually
realized in the major part of problems of physical interest. In
particular, energy levels are as a rule multiple in all problems with
spherical or cylindrical symmetry (as we have seen, for instance,
in the case of hydrogen). If we then take into account the corrections
for spin and relativity which will be introduced in what
follows, some of the coincident energy levels separate slightly, giving
rise to the "fine structure," or to multiplets. That is, we pass
from complete degeneracy to the quasi-degenerate case. (Each
component of the multiplet, however, may in turn present complete
degeneracy.)

In this whole section let us adopt the convention that a letter
$i, j, k, l, \ldots$ indicates any index (taking on the values $1, 2, \ldots p$)
which serves to distinguish between the various components of the
multiplet, and that the Greek letters $\rho, \sigma, \ldots$ will serve as indices
(assuming all integral positive values except $1, 2, \ldots p$) referring
to states which are not part of the multiplet. Let us then indicate
(in conformity with the notation adopted above) by $E_i^0$ and $E_\sigma^0$ the
unperturbed eigenvalues,⁵ by $E_i$ and $E_\sigma$ the perturbed eigenvalues,
and by $\psi_i^0$, $\psi_\sigma^0$ and $\psi_i$, $\psi_\sigma$, respectively, the eigenfunctions
corresponding to them. It is then convenient to introduce the average
energy level of the multiplet, that is, the quantity

$$E_0 = \frac{1}{p}\sum_i E_i^0, \qquad (180)$$

and to put

$$E_i^0 = E_0 + \epsilon_i^0, \qquad E_i = E_0 + \epsilon_i. \qquad (181)$$

In this way the quantities $\epsilon$ turn out to be small (of the first order)
compared with the $E$. It is apparent that in the case of complete
degeneracy, the $\epsilon_i^0$ are all zero.

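After these preliminaries, we write the Schrödinger equation for
the unperturbed states as follows, denoting as before the unperturbed
Hamiltonian operator by $\mathcal{H}^0$:

$$\mathcal{H}^0\psi_i^0 = E_i^0\psi_i^0, \qquad \mathcal{H}^0\psi_\sigma^0 = E_\sigma^0\psi_\sigma^0. \qquad (182)$$

For the ith perturbed state we shall have instead

$$(\mathcal{H}^0 + \mathcal{L})\psi_i = E_i\psi_i. \qquad (183)$$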

⁵ The levels $E_\sigma^0$ may also be grouped, either entirely or partially, into
multiplets or multiple levels. This fact does not change anything in the following
formulas, provided that to every energy level of multiplicity $n$ we make
correspond $n$ values of the index $\sigma$, as if dealing with distinct levels.

Now it is to be noted that, in contrast to what occurs in the
previous case, the perturbed eigenfunction $\psi_i$ generally does not lie
close to the unperturbed eigenfunction $\psi_i^0$ but differs from it (and
from the other $\psi_j^0$) by terms which may not be considered small.
This condition may be foreseen intuitively from the last section,
since the closeness of other energy levels to the level $E_i^0$ gives rise, in
the summation of (177), to terms which are no longer small with
respect to $\psi_n^0$. In other words, the various eigenfunctions of the
multiplet "mix" without any one of them being predominant over
the others. Therefore, if we expand $\psi_i$ in a series by means of the
orthogonal functions $\psi_j^0$, $\psi_\sigma^0$ (which together form a complete system),
we have

$$\psi_i = \sum_j c_{ij}\psi_j^0 + \sum_\sigma a_{i\sigma}\psi_\sigma^0. \qquad (184)$$

We may consider the $a$ as small of first order with respect to unity,
and the $c$ as of the order of magnitude 1 in general. Substituting
into (183) and using (182), we obtain

$$\sum_j c_{ij}(E_j^0 - E_i + \mathcal{L})\psi_j^0 + \sum_\sigma a_{i\sigma}(E_\sigma^0 - E_i + \mathcal{L})\psi_\sigma^0 = 0$$

or else

$$\sum_j c_{ij}(\epsilon_j^0 - \epsilon_i + \mathcal{L})\psi_j^0 + \sum_\sigma a_{i\sigma}(E_\sigma^0 - E_i + \mathcal{L})\psi_\sigma^0 = 0.$$

Multiplying by $\psi_k^{0*}$ and integrating over the whole range of the
coordinates, we obtain, by making use of the orthogonality and
normalization of the $\psi^0$ and introducing (172),

$$\sum_j c_{ij}\left[(\epsilon_j^0 - \epsilon_i)\delta_{kj} + L_{kj}\right] + \sum_\sigma a_{i\sigma}L_{k\sigma} = 0.$$

Since the $L$ are small of the first order, like the $a$, the second summation
will be negligible in first approximation. If we insert for
the $c_{ij}$ and $\epsilon_i$ the values of first approximation, which we shall call
$c_{ij}^0$ and $\epsilon'_i$, we are left with

$$\sum_j c_{ij}^0\left[(\epsilon_j^0 - \epsilon'_i)\delta_{kj} + L_{kj}\right] = 0. \qquad (185)$$

Having fixed the value of $i$, and giving to $k$ the $p$ values $1, 2, \ldots p$,
we obtain from this formula a system of $p$ linear homogeneous
equations in the $p$ unknowns $c_{i1}^0, c_{i2}^0, \ldots c_{ip}^0$. In the coefficients
of this system there occurs, besides the known quantities $L_{kj}$, $\epsilon_j^0$,
the still unknown quantity $\epsilon'_i$.

Now for the system to have solutions which are not all zero,
the determinant of the coefficients has to vanish; that is, we must
have

$$\begin{vmatrix}
L_{11} + \epsilon_1^0 - \epsilon'_i & L_{12} & L_{13} & \cdots \\
L_{21} & L_{22} + \epsilon_2^0 - \epsilon'_i & L_{23} & \cdots \\
L_{31} & L_{32} & L_{33} + \epsilon_3^0 - \epsilon'_i & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{vmatrix} = 0. \qquad (186)$$
From this equation of degree $p$ we can get $\epsilon'_i$. This is of the
form called "secular equation" (see §12); and since $L_{ik} = L_{ki}^*$ its
$p$ roots are all real (and for now we assume them to be distinct).
It is to be noted that the $p$ systems of equations which are obtained
from (185) by giving to $i$ the $p$ values it may assume, all lead to the
same secular equation, so that the $p$ roots of equation (186) may
be taken as the values of $\epsilon'_1, \epsilon'_2, \ldots \epsilon'_p$. In order to establish to
which root each of the indices $1, 2, \ldots p$ corresponds, it suffices to
observe that if all the $L_{ik}$ tend to zero, the last equation approaches

$$(\epsilon_1^0 - \epsilon')(\epsilon_2^0 - \epsilon') \cdots (\epsilon_p^0 - \epsilon') = 0,$$

and hence one of the roots tends toward $\epsilon_1^0$, one toward $\epsilon_2^0$, and so on.
The correlation is made with the aid of this criterion.

Thus, from a solution of the secular equation (186), we have,
to a first approximation, the perturbations of the eigenvalues of
the multiplet.
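For a multiplet of only two components the secular equation (186) is a quadratic and can be solved directly. The sketch below assumes a hypothetical doublet with a real symmetric $L$ (all numbers and names are illustrative). It also evaluates, for contrast, the second-order term of (178), which is not applicable here because condition (179) is violated, and checks the correlation criterion by letting the coupling become very weak.

```python
import math


def secular_roots_2x2(eps0, L):
    """Roots eps' of the p = 2 secular determinant (186), real symmetric L."""
    d1 = L[0][0] + eps0[0]
    d2 = L[1][1] + eps0[1]
    mean, half = 0.5 * (d1 + d2), 0.5 * (d1 - d2)
    root = math.sqrt(half ** 2 + L[0][1] * L[1][0])
    return mean - root, mean + root


eps0 = [-0.01, 0.01]                 # hypothetical offsets eps_i^0 from E_0
L = [[0.0, 0.05], [0.05, 0.0]]       # coupling larger than the level spacing
roots = secular_roots_2x2(eps0, L)

# Formula (178) applied blindly to the doublet: the small denominator makes
# the "correction" larger than the splitting it is supposed to correct.
naive = L[0][1] * L[1][0] / (eps0[0] - eps0[1])

# Correlation criterion: for vanishing L the roots tend to eps_1^0, eps_2^0.
weak = secular_roots_2x2(eps0, [[0.0, 1e-6], [1e-6, 0.0]])
print(roots, naive, weak)
```

In this two-level sketch the lower root is correlated with the lower $\epsilon_i^0$, as the weak-coupling limit shows.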

From each of the systems (185) we then obtain the values
$c_{i1}^0, c_{i2}^0, \ldots c_{ip}^0$ to within a constant factor. This factor is
determined by imposing the condition

$$\sum_j |c_{ij}^0|^2 = 1. \qquad (187)$$

With this the $c_{ij}^0$ will have the property⁶

$$\sum_j c_{ij}^{0*}c_{kj}^0 = \delta_{ik}, \qquad (188)$$

which may be expressed by saying that they form the coefficients
of an "orthogonal transformation."

If we then put

$$c_{ij} = c_{ij}^0 + c'_{ij},$$

where the first term is of the order unity and the second term is a
small first-order correction, we may write (184)

$$\psi_i = \sum_j c_{ij}^0\psi_j^0 + \sum_j c'_{ij}\psi_j^0 + \sum_\sigma a_{i\sigma}\psi_\sigma^0$$

or also

$$\psi_i = \varphi_i + \sum_j c'_{ij}\psi_j^0 + \sum_\sigma a_{i\sigma}\psi_\sigma^0, \qquad (189)$$

where we have put

$$\varphi_i = \sum_j c_{ij}^0\psi_j^0. \qquad (190)$$

In (189), $\varphi_i$ represents the principal term. As may be seen, the
perturbed eigenfunction $\psi_i$ is approximately (to within terms of
the first order) equal, not to $\psi_i^0$, but to $\varphi_i$. The $\varphi_i$ may be called the
eigenfunctions of the zero-order approximation.

⁶ Indeed, it may be shown that by using (185) and the relation $L_{jk} = L_{kj}^*$,

The limiting case of complete degeneracy deserves separate
consideration. In this case we deal, not with a multiplet, but
actually with a level of multiplicity $p$. Here (185) becomes, since
the $\epsilon_j^0$ are zero,

$$\sum_j c_{ij}^0(L_{kj} - \epsilon'_i\delta_{kj}) = 0, \qquad (185')$$

and the secular equation to be solved becomes

$$\begin{vmatrix}
L_{11} - \epsilon'_i & L_{12} & L_{13} & \cdots \\
L_{21} & L_{22} - \epsilon'_i & L_{23} & \cdots \\
L_{31} & L_{32} & L_{33} - \epsilon'_i & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{vmatrix} = 0. \qquad (200)$$

To its $p$ real roots we may assign the indices $1, 2, \ldots p$ in any
order. The effect of the perturbation is to "split" the energy
level $E_0$ into a group of $p$ neighboring levels, given by $E_0 + \epsilon'_1$,
$E_0 + \epsilon'_2$, $\ldots$ $E_0 + \epsilon'_p$, to a first approximation. To the level $E_0$ there
corresponded $p$ stationary states, different as far as the $\psi$ are concerned,
but identical as to energy. By the effect of the perturbation
each of these states acquires a slightly different energy. We
may say that the perturbation removes the degeneracy.⁷

⁷ If the secular equation had two or more coincident roots, the degeneracy
would be only partially removed; that is, the level $E_0$ of multiplicity $p$ would
split into a group of levels, some of which would still be multiple (of order $< p$)
in spite of the perturbation (at least to a first approximation). This case occurs
frequently.

The following consideration will furnish an intuitive picture of
the fact that the perturbed $\psi_i$ are not, in general, approximately
equal to the $\psi_i^0$, but rather to certain of their linear combinations
$\varphi_i$. From §6 of Part II, we know that to a multiple eigenvalue (of
order $p$) we may attribute an infinite number of systems of $p$
(orthonormal) eigenfunctions, systems which may all be obtained
from any one of these systems by an orthogonal linear transformation.
The $\psi_i^0$ represent any one of these systems (selected at
random). Now we think of the perturbed state, which is not
degenerate (hence the $\psi_i$ represent a well-defined system of
eigenfunctions). If the perturbation is permitted to go to zero, these
$\psi_i$ will tend toward a well-defined system of unperturbed
eigenfunctions $\varphi_i$, and there is no reason why this should just be the
system of the $\psi_i^0$. In general, it will be related to the latter by a
certain linear orthogonal transformation. The systems of equations
(185) merely serve to find the coefficients $c_{ij}^0$ of that transformation,
and hence to determine, by means of (190), the $\varphi_i$, which
by virtue of (188) will be normalized and orthogonal to each other
(but not to the $\psi_i^0$).
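The determination of the zero-order eigenfunctions can be illustrated on the smallest possible case. The following sketch assumes a hypothetical doubly degenerate level ($p = 2$, all $\epsilon_j^0 = 0$) with a real symmetric perturbation block whose entries are invented for the example. It solves (185'), normalizes by (187), and checks (188) together with the fact that the perturbation is diagonal in the basis of the $\varphi_i$.

```python
import math


def degenerate_doublet(L):
    """Solve (185') for p = 2 with a real symmetric L (L[0][1] != 0).

    Returns a list of (eps', [c_i1^0, c_i2^0]) pairs, one per root.
    """
    mean, half = 0.5 * (L[0][0] + L[1][1]), 0.5 * (L[0][0] - L[1][1])
    root = math.sqrt(half ** 2 + L[0][1] ** 2)
    solutions = []
    for eps in (mean - root, mean + root):
        c = [L[0][1], eps - L[0][0]]          # satisfies the k = 1 equation
        norm = math.sqrt(c[0] ** 2 + c[1] ** 2)
        solutions.append((eps, [c[0] / norm, c[1] / norm]))   # condition (187)
    return solutions


def element_between(ck, ci, L):
    """Integral of phi_k* L phi_i expressed through the c^0 (real case)."""
    return sum(ck[j] * L[j][l] * ci[l] for j in range(2) for l in range(2))


L = [[0.01, 0.03], [0.03, -0.02]]             # hypothetical perturbation block
(eps_a, c_a), (eps_b, c_b) = degenerate_doublet(L)
overlap = c_a[0] * c_b[0] + c_a[1] * c_b[1]   # should vanish, eq. (188)
off_diag = element_between(c_a, c_b, L)       # should vanish
diag_a = element_between(c_a, c_a, L)         # should equal eps_a
print(eps_a, eps_b, overlap, off_diag, diag_a)
```

The two roots give the first-order splitting of the level, and the corresponding coefficient vectors define the particular orthonormal system toward which the perturbed eigenfunctions tend.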

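Let us now once again take up the general case, and let us try
to find (to a first approximation) the perturbed eigenfunctions given
by (189). It is convenient to transform this formula further by
also expressing the second term in terms of the $\varphi$ instead of the
$\psi^0$. Thereupon the formula becomes

$$\psi_i = \varphi_i + \sum_l \eta_{il}\varphi_l + \sum_\sigma a_{i\sigma}\psi_\sigma^0, \qquad (201)$$

where the coefficients $\eta_{il}$ (small of the first order) are related to the
$c'_{ij}$ by the following linear relations, which are found immediately
by using (190):

$$c'_{ij} = \sum_l \eta_{il}c_{lj}^0. \qquad (202)$$

Let us now express the fact that $\psi_i$ satisfies the Schrödinger equation
(183), namely, that

$$(\mathcal{H}^0 - E_i + \mathcal{L})\Big(\varphi_i + \sum_l \eta_{il}\varphi_l + \sum_\sigma a_{i\sigma}\psi_\sigma^0\Big) = 0.$$

Upon developing this expression, we note that, because of (190)
and the first equation of (182),

$$\mathcal{H}^0\varphi_i = \sum_j c_{ij}^0E_j^0\psi_j^0 = E_0\varphi_i + \sum_j \epsilon_j^0c_{ij}^0\psi_j^0,$$

and similarly for $\mathcal{H}^0\varphi_l$. Furthermore, we recall the second equation
(182) and take (202) into account. We then obtain

$$-\epsilon_i\varphi_i + \sum_j \epsilon_j^0c_{ij}^0\psi_j^0 + \mathcal{L}\varphi_i - \epsilon_i\sum_l \eta_{il}\varphi_l + \sum_j \epsilon_j^0c'_{ij}\psi_j^0$$
$$\qquad + \sum_l \eta_{il}\mathcal{L}\varphi_l + \sum_\sigma a_{i\sigma}(E_\sigma^0 - E_i)\psi_\sigma^0 + \sum_\sigma a_{i\sigma}\mathcal{L}\psi_\sigma^0 = 0. \qquad (203)$$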

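From this we may first of all obtain the $a$ by multiplying the equation
by $\psi_\rho^{0*}$ and integrating. Then, since $\psi_\rho^0$ is orthogonal to
$\varphi_i$, to the $\varphi_l$, to the $\psi_j^0$, and to all the $\psi_\sigma^0$ for which $\sigma \ne \rho$, we obtain

$$\Lambda_{\rho i} + \sum_l \eta_{il}\Lambda_{\rho l} + a_{i\rho}(E_\rho^0 - E_i) + \sum_\sigma a_{i\sigma}L_{\rho\sigma} = 0, \qquad (204)$$

where we have introduced the following notation, analogous to
(172):

$$\Lambda_{\rho i} = \int \psi_\rho^{0*}\,\mathcal{L}\,\varphi_i\,dS. \qquad (205)$$

Equation (204) was obtained without approximations. Its first
and third terms are small quantities of the first order, the others
of the second order. Neglecting the latter and replacing $E_i$ by $E_i^0$,
we obtain for the $a_{i\rho}$ the value in first approximation

$$a'_{i\rho} = \frac{\Lambda_{\rho i}}{E_i^0 - E_\rho^0}. \qquad (206)$$

To solve for the $\eta$ and the second approximation of the $\epsilon$, we
shall proceed in analogous fashion, multiplying (203) by $\varphi_k^*$ and
integrating. First, though, we note that

$$\int \varphi_k^*\psi_j^0\,dS = \sum_l c_{kl}^{0*}\int \psi_l^{0*}\psi_j^0\,dS = c_{kj}^{0*},$$

and we put

$$\Lambda_{ki} = \int \varphi_k^*\,\mathcal{L}\,\varphi_i\,dS \qquad (205')$$

(and similarly $\Lambda_{k\sigma}$, with $\psi_\sigma^0$ in place of $\varphi_i$). Hence we obtain, from (203),

$$-\epsilon_i\delta_{ik} + \sum_j \epsilon_j^0c_{ij}^0c_{kj}^{0*} + \Lambda_{ki} - \epsilon_i\eta_{ik} + \sum_j \epsilon_j^0c'_{ij}c_{kj}^{0*}$$
$$\qquad + \sum_l \eta_{il}\Lambda_{kl} + \sum_\sigma a_{i\sigma}\Lambda_{k\sigma} = 0. \qquad (207)$$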
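Now we see that because of (190), equation (205') may be
written thus:

$$\Lambda_{ki} = \int \varphi_k^*\,\mathcal{L}\,\varphi_i\,dS = \sum_{l,j} c_{kl}^{0*}c_{ij}^0L_{lj},$$

and since (185) may be written (by changing the index $k$ into $l$)

$$\sum_j c_{ij}^0L_{lj} = c_{il}^0(\epsilon'_i - \epsilon_l^0),$$

we have

$$\Lambda_{ki} = \sum_l c_{kl}^{0*}c_{il}^0(\epsilon'_i - \epsilon_l^0),$$

and by changing the summation index $l$ into $j$ and using (188):

$$\Lambda_{ki} = \epsilon'_i\delta_{ik} - \sum_j \epsilon_j^0c_{ij}^0c_{kj}^{0*}. \qquad (208)$$

If we substitute this expression for the third term of (207), we
see that the second term of the latter cancels against the summation
of (208). If relations (208) and (202) are taken into account, the
next to last term of (207) transforms into

$$\sum_l \eta_{il}\Lambda_{kl} = \sum_l \eta_{il}\Big(\epsilon'_l\delta_{lk} - \sum_j \epsilon_j^0c_{lj}^0c_{kj}^{0*}\Big)
= \epsilon'_k\eta_{ik} - \sum_j \epsilon_j^0c_{kj}^{0*}\sum_l \eta_{il}c_{lj}^0 = \epsilon'_k\eta_{ik} - \sum_j \epsilon_j^0c'_{ij}c_{kj}^{0*}.$$

Thus (207) finally reduces to

$$-(\epsilon_i - \epsilon'_i)\delta_{ik} - (\epsilon_i - \epsilon'_k)\eta_{ik} + \sum_\sigma a_{i\sigma}\Lambda_{k\sigma} = 0. \qquad (209)$$

From this we obtain, for $k \ne i$,

$$\eta_{ik} = \frac{1}{\epsilon_i - \epsilon'_k}\sum_\sigma a_{i\sigma}\Lambda_{k\sigma},$$

or else, in first approximation, utilizing (206),

$$\eta'_{ik} = \frac{1}{\epsilon'_i - \epsilon'_k}\sum_\sigma \frac{\Lambda_{k\sigma}\Lambda_{\sigma i}}{E_i^0 - E_\sigma^0} \qquad (i \ne k). \qquad (210)$$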

In order to complete our knowledge of the perturbed eigenfunctions
in first approximation, we still need the coefficients $\eta_{ii}$,
which are determined (like $a_{nn}$ in the preceding section) from the
normalization condition for $\psi_i$, and again we find that they may be
taken to be zero.

Let us proceed to the determination of the perturbed eigenvalues
in second approximation, which is fairly important in practice.
These eigenvalues are obtained from (209) for $k = i$. In fact, if we
designate by $\epsilon''_i$ the terms of second order and neglect those of
higher order, that is, if we let $\epsilon_i = \epsilon'_i + \epsilon''_i$, then (209) yields, for
$k = i$,

$$-\epsilon''_i - \epsilon''_i\eta_{ii} + \sum_\sigma a_{i\sigma}\Lambda_{i\sigma} = 0;$$

or, if we neglect the second term (which was taken to be zero and
is of the third order in any case) and use (206),

$$\epsilon''_i = \sum_\sigma \frac{\Lambda_{i\sigma}\Lambda_{\sigma i}}{E_i^0 - E_\sigma^0}. \qquad (211)$$

Hence the second approximation for the eigenvalues of a multiplet,
or of a multiple level, is given by

$$E_i = E_0 + \epsilon'_i + \sum_\sigma \frac{\Lambda_{i\sigma}\Lambda_{\sigma i}}{E_i^0 - E_\sigma^0}. \qquad (212)$$
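Formula (212) can be checked on a small model that happens to be exactly solvable. The sketch below assumes a hypothetical doubly degenerate level coupled to a single distant level; the coupling pattern (equal coupling of both components to the distant level) is chosen only so that an exact comparison is available, and the symbol names follow the reconstruction above rather than anything fixed by the text.

```python
import math

E0 = 1.0            # doubly degenerate level (all eps_i^0 = 0), hypothetical
E_far = 3.0         # one distant level, playing the role of the index sigma
g, v = 0.05, 0.1    # hypothetical couplings: L_12 = g, L_13 = L_23 = v

# Zero-order eigenfunctions inside the doublet for L = [[0, g], [g, 0]]:
# eps' = +g with c = (1, 1)/sqrt(2) and eps' = -g with c = (1, -1)/sqrt(2).
s = 1.0 / math.sqrt(2.0)
doublet = [(+g, [s, s]), (-g, [s, -s])]

levels = []
for eps1, c in doublet:
    lam = c[0] * v + c[1] * v            # element between psi_sigma^0 and phi_i
    eps2 = lam * lam / (E0 - E_far)      # eq. (211) with a single sigma
    levels.append(E0 + eps1 + eps2)      # eq. (212)

# Exact comparison: the antisymmetric combination decouples completely, and
# the symmetric one forms a 2x2 block with the distant level.
h11, h22, h12 = E0 + g, E_far, math.sqrt(2.0) * v
exact_plus = 0.5 * (h11 + h22) - math.sqrt((0.5 * (h22 - h11)) ** 2 + h12 ** 2)
exact_minus = E0 - g
print(levels, exact_plus, exact_minus)
```

Only the symmetric combination is pushed down by the distant level; the first approximation alone would miss this shift, which here is about five times smaller than the first-order splitting.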


40. Perturbation theory by the matrix method. The perturbation
theory of stationary states developed in the previous sections
may naturally be presented from the point of view of the matrix
method as well, which, as we know, leads to results equivalent to the
results of wave mechanics. We shall now show this presentation
as an example, confining ourselves to the first approximation and
to the nondegenerate case.

If we assume for our system of reference in Hilbert space the
system defined by the $\psi^0$, that is, if we refer to the
"$\mathcal{H}^0$-representation" of §33, the unperturbed Hamiltonian operator
$\mathcal{H}^0$ is represented by the diagonal matrix

$$\{\mathcal{H}^0\} = \begin{pmatrix}
E_1^0 & 0 & 0 & \cdots \\
0 & E_2^0 & 0 & \cdots \\
0 & 0 & E_3^0 & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{pmatrix}.$$
The perturbed Hamiltonian operator $\mathcal{H} = \mathcal{H}^0 + \mathcal{L}$ will instead be
represented, with respect to the same axes, by a nondiagonal matrix
$\{\mathcal{H}\}$, in which, however, the off-diagonal elements are small
compared with the diagonal elements. We are to effect a change of
axes of reference such that with respect to the new axes, this operator
will be represented by a diagonal matrix $\{\mathcal{H}\}'$, whose elements
will give the perturbed energy levels $E_r$. For this purpose we
recall from §7 that to a change of axes there corresponds a
transformation matrix $\{S\}$ and that the transform $\{\mathcal{H}\}'$ of the matrix
$\{\mathcal{H}\}$ is given by $\{\mathcal{H}\}' = \{S\}^{-1}\{\mathcal{H}\}\{S\}$; that is, we have

$$\{S\}\{\mathcal{H}\}' = \{\mathcal{H}\}\{S\}. \qquad (213)$$

In the actual case, since the matrix $\{\mathcal{H}\}$ is quasi-diagonal, the
transformation matrix rendering it diagonal will be little different
from the unit matrix $\{1\}$ (or it will define a very slight rotation of
the reference axes), and hence we shall write it in the form

$$\{S\} = \{1\} + \{a\}, \qquad (214)$$

where the elements $a_{nm}$ of the matrix $\{a\}$ are small quantities of
the first order, which we are to determine. Equation (213) then
becomes, if we substitute $\{\mathcal{H}\} = \{\mathcal{H}^0\} + \{\mathcal{L}\}$ and neglect the
product of the second order $\{\mathcal{L}\}\{a\}$, and therefore replace $\{a\}$ by its
first approximation $\{a'\}$,

$$\{\mathcal{H}\}' = \{\mathcal{H}^0\} + \{\mathcal{L}\} + \{\mathcal{H}^0\}\{a'\} - \{a'\}\{\mathcal{H}\}',$$

or, since we may replace $\{\mathcal{H}\}'$ by $\{\mathcal{H}^0\}$ in the last term, making an
error of the second order, we have

$$\{\mathcal{H}\}' = \{\mathcal{H}^0\} + \{\mathcal{L}\} + \{\mathcal{H}^0\}\{a'\} - \{a'\}\{\mathcal{H}^0\}. \qquad (215)$$

If we recall that the elements of $\{\mathcal{H}^0\}$ are of the form $\mathcal{H}^0_{rs} = E_r^0\delta_{rs}$,
whereas the elements of $\{\mathcal{H}\}'$ must be of the form $\mathcal{H}'_{rs} = E_r\delta_{rs}$, we
see that the relation (215) may be translated into the following
relation between the elements:

$$E_n\delta_{nm} = E_n^0\delta_{nm} + L_{nm} + (E_n^0 - E_m^0)a'_{nm}. \qquad (216)$$

From this equation, we may simultaneously obtain the perturbed
eigenvalues, even without determining the $a'_{nm}$. In fact, for $m = n$
relation (216) becomes

$$E_n = E_n^0 + L_{nn}, \qquad (217)$$
and thus we find the result (175). On the other hand, for $m \ne n$
we obtain

$$a'_{nm} = \frac{L_{nm}}{E_m^0 - E_n^0}. \qquad (218)$$

The diagonal elements $a_{nn}$ remain arbitrary (provided they are pure
imaginary, as found in §38), and may be taken to be zero.

Having thus determined the transformation matrix $\{S\}$, we
recall that the versors (unit vectors) $\psi_n$ of the rotated axes are
obtained from those of the original axes by means of the formula (see §7)

$$\psi_n = \sum_m S_{mn}\psi_m^0$$

or, since $S_{mn} = \delta_{mn} + a_{mn}$,

$$\psi_n = \psi_n^0 + \sum_m a_{mn}\psi_m^0.$$

Since $a_{nn} = 0$, it will be convenient to use, instead of the index $m$,
the summation index $\sigma$ (which excludes the value $n$). With this
substitution and with the first approximation values (218) introduced
for the $a$, the formula becomes identical with (177). We
note that the coefficients $a$ of §38 (where we have neglected to write
the index $n$) are identified with the matrix elements of $\{a\}$ of (214),
and hence define the infinitesimal rotation which brings us from
the axes in which $\{\mathcal{H}^0\}$ is diagonal to the axes in which $\{\mathcal{H}\}$ is
diagonal. This is the interpretation which, in Hilbert space, must
be given to the process of approximation developed in §38.
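The statement that $\{1\} + \{a'\}$ diagonalizes $\{\mathcal{H}\}$ up to terms of the second order can be verified with a few lines of matrix arithmetic. The sketch below assumes a hypothetical three-level system with invented numbers, builds $\{a'\}$ from (218) and the diagonal matrix from (217), and measures how well relation (213) is satisfied with and without the small rotation.

```python
E0 = [1.0, 2.0, 4.0]                      # hypothetical unperturbed levels
L = [[0.01, 0.02, 0.03],
     [0.02, -0.01, 0.01],
     [0.03, 0.01, 0.02]]                  # hypothetical real symmetric {L}
N = len(E0)


def matmul(A, B):
    """Plain matrix product of two N x N lists of lists."""
    return [[sum(A[r][k] * B[k][s] for k in range(N)) for s in range(N)]
            for r in range(N)]


def max_abs_diff(A, B):
    """Largest absolute difference between corresponding elements."""
    return max(abs(A[r][s] - B[r][s]) for r in range(N) for s in range(N))


H = [[(E0[r] if r == s else 0.0) + L[r][s] for s in range(N)] for r in range(N)]
# a'_{nm} = L_nm / (E_m^0 - E_n^0) for m != n, eq. (218); diagonal set to zero
a = [[0.0 if r == s else L[r][s] / (E0[s] - E0[r]) for s in range(N)]
     for r in range(N)]
S = [[(1.0 if r == s else 0.0) + a[r][s] for s in range(N)] for r in range(N)]
H_diag = [[(E0[r] + L[r][r]) if r == s else 0.0 for s in range(N)]
          for r in range(N)]              # eq. (217) on the diagonal

defect_rotated = max_abs_diff(matmul(H, S), matmul(S, H_diag))   # (213), S = 1 + a'
defect_unrotated = max_abs_diff(H, H_diag)                       # (213), S = 1
print(defect_rotated, defect_unrotated)
```

Without the rotation the defect is of the size of the off-diagonal elements of $\{\mathcal{L}\}$; with it, only products of two small quantities remain.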

41. Perturbation of nonstationary states. Time-dependent
perturbations. Let us now treat the perturbation problem in a more
general manner, so as to include also the case of states resulting
from the superposition of several stationary states, and also the
case of time-dependent perturbations. The method which we shall
employ is known in mathematics as the "method of the variation
of constants."

As before, let us indicate by $\psi_r^0$ the eigenfunctions of the
unperturbed system, which are of the form

$$\psi_r^0 = u_r^0(q)\,e^{-\frac{2\pi i}{h}E_r^0 t} \qquad (219)$$

and satisfy the equation

$$\mathcal{H}^0\psi_r^0 = -\frac{h}{2\pi i}\frac{\partial \psi_r^0}{\partial t}. \qquad (220)$$
Let us now introduce a perturbation, possibly also depending
on the time, for which the Hamiltonian becomes

$$\mathcal{H}^0(q, p) + \mathcal{L}(q, p, t).$$

Here, for the first time, we are confronted by a Hamiltonian depending
upon the time. We shall postulate that for this case the
time-dependent Schrödinger equation also holds, in the same form as
used thus far. The perturbed eigenfunction $\psi$ will therefore satisfy
the equation

$$(\mathcal{H}^0 + \mathcal{L})\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}. \qquad (220')$$

This $\psi$ may be expanded in terms of the orthogonal functions
$\psi_r^0$. Then we have

$$\psi = \sum_r c_r\psi_r^0, \qquad (221)$$

where the coefficients $c_r$ will in general be functions of $t$. Substituting
this expansion into (220') (and indicating, as we shall often
do, the derivative with respect to time by a dot) we obtain, taking
(220) into account,

$$-\frac{h}{2\pi i}\sum_r \dot{c}_r\psi_r^0 = \sum_r c_r\mathcal{L}\psi_r^0.$$

From this the $\dot{c}$ may be obtained by multiplying both sides by
$\psi_s^{0*}$ and integrating over all $q$-space. We thus obtain

$$\dot{c}_s = -\frac{2\pi i}{h}\sum_r L_{sr}c_r, \qquad (222)$$

where, as in the preceding sections, the quantities

$$L_{sr} = \int \psi_s^{0*}\,\mathcal{L}\,\psi_r^0\,dS \qquad (223)$$

represent the elements of the "perturbation matrix" and may also
be written, by virtue of (219),

$$L_{sr} = \lambda_{sr}\,e^{\frac{2\pi i}{h}(E_s^0 - E_r^0)t}, \qquad (224)$$

with

$$\lambda_{sr} = \int u_s^{0*}\,\mathcal{L}\,u_r^0\,dS. \qquad (225)$$
Let us suppose that for $t = 0$ the state of the system is represented
by a certain $\psi(0)$ considered known, which when developed
in a series in terms of the $u_r^0$ is

$$\psi(0) = \sum_r c_r^0u_r^0. \qquad (226)$$

Comparing with (221) and noting that the $u_r^0$ are the values of the
$\psi_r^0$ for $t = 0$, we see that the coefficients $c_r^0$ of the expansion (226)
(which are considered to be known) represent the initial values of
the $c_r$. These initial values are to be associated with the differential
equations (222) in order to obtain the $c_r$ as functions of $t$, or the
evolution of the state of the system. In the absence of perturbations,
the $c_r$ would evidently maintain their initial values, and hence
the state of the (unperturbed) system at time $t$ would be given by

$$\psi^0(t) = \sum_r c_r^0\psi_r^0. \qquad (227)$$

All these formulas are rigorous no matter what the nature of
the perturbation. Let us now suppose that the perturbed state
at time $t$ differs little from the unperturbed state, or that $\psi(t)$
differs little from $\psi^0(t)$, that is, only by terms of the first order
(hence this approximation will be valid for a time which is the
longer, starting from $t = 0$, the weaker the perturbation). Then
the $c_r(t)$ differ from the $c_r^0$ by terms of the first order, so that in the
right-hand member of (222) we may replace $c_r(t)$ by $c_r^0$, introducing
an error of the second order. We then obtain for the $\dot{c}_s$ the
expressions

$$\dot{c}'_s = -\frac{2\pi i}{h}\sum_r L_{sr}c_r^0,$$

where the prime indicates that we are dealing with a first approximation.
Integrating from 0 to $t$, we obtain the first-approximation
values of the $c_s$:

$$c'_s(t) = c_s^0 - \frac{2\pi i}{h}\sum_r c_r^0\int_0^t L_{sr}\,dt. \qquad (228)$$

These, when inserted into (221), give the first approximation for
the perturbed state.
Then, substituting (228) into the right-hand side of (222) and
integrating between 0 and $t$, we can easily obtain the second-order
approximation, and similarly for successive ones; here, however,
we shall confine ourselves to the first.
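The quality of the first approximation (228) can be judged by integrating the rigorous equations (222) numerically. The sketch below assumes a hypothetical two-level system with a constant, weak $\lambda_{sr}$ and units in which $h = 2\pi$ (so that $2\pi i/h = i$); the Runge-Kutta integrator and all names are illustrative choices, not part of the text.

```python
import cmath

# Units with h = 2*pi, so that the factor 2*pi*i/h equals i (assumption).
E0 = [0.0, 1.0]                  # hypothetical unperturbed energies
lam = [[0.0, 0.01],
       [0.01, 0.0]]              # hypothetical constant lambda_sr of (225)


def L_elem(s, r, t):
    """Perturbation matrix element (224)."""
    return lam[s][r] * cmath.exp(1j * (E0[s] - E0[r]) * t)


def c_dot(c, t):
    """Right-hand side of the rigorous equations (222)."""
    return [-1j * sum(L_elem(s, r, t) * c[r] for r in range(2)) for s in range(2)]


def integrate_rk4(c0, t_end, steps):
    """Classical fourth-order Runge-Kutta integration of (222)."""
    dt = t_end / steps
    c, t = list(c0), 0.0
    for _ in range(steps):
        k1 = c_dot(c, t)
        k2 = c_dot([c[i] + 0.5 * dt * k1[i] for i in range(2)], t + 0.5 * dt)
        k3 = c_dot([c[i] + 0.5 * dt * k2[i] for i in range(2)], t + 0.5 * dt)
        k4 = c_dot([c[i] + dt * k3[i] for i in range(2)], t + dt)
        c = [c[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
             for i in range(2)]
        t += dt
    return c


def first_order(c0, t):
    """Eq. (228), with the time integral of (224) carried out analytically."""
    result = []
    for s in range(2):
        total = c0[s]
        for r in range(2):
            omega = E0[s] - E0[r]
            if omega == 0:
                integral = lam[s][r] * t
            else:
                integral = lam[s][r] * (cmath.exp(1j * omega * t) - 1) / (1j * omega)
            total += -1j * c0[r] * integral
        result.append(total)
    return result


c_start = [1.0 + 0j, 0.0 + 0j]   # system initially in the first state
t_end = 2.0
c_num = integrate_rk4(c_start, t_end, 2000)
c_first = first_order(c_start, t_end)
print(c_num, c_first)
```

For this weak perturbation the two results differ only by quantities of the second order in $\lambda$ or smaller, in line with the estimate given in the text.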

42. Transition probability. Let us now suppose that the perturbation
lasts only for a certain interval of time, from 0 to $t_1$, while
for $t < 0$ and $t > t_1$ we have $\mathcal{L} = 0$. We assume further that prior
to the instant $t = 0$ the system is in the stationary state $\psi_n^0$ of
energy $E_n^0$; that is, we assume that in (226) we have

$$c_n^0 = 1, \qquad c_p^0 = 0 \quad (\text{for } p \ne n). \qquad (229)$$

From the instant when the perturbation begins to act, this state
is no longer stationary, and the $\psi$ of the system for the time from
0 to $t_1$ may be written in the form (221), where the $c_r$, with equations
(228) and the initial values (229) taken into consideration, are
given in first approximation by

$$c'_n(t) = 1 - \frac{2\pi i}{h}\int_0^t L_{nn}\,dt, \qquad c'_p(t) = -\frac{2\pi i}{h}\int_0^t L_{pn}\,dt \quad (p \ne n).$$

From the instant $t_1$ on, the coefficients $c_n$, $c_p$ become constants;
but they will have, rather than the values (229), the values obtained
from the preceding formulas in which $t$ was replaced by $t_1$. We
shall call them $c_n^1$, $c_p^1$; that is, we shall put

$$c_n^1 = 1 - \frac{2\pi i}{h}\int_0^{t_1} L_{nn}\,dt, \qquad (230)$$

$$c_p^1 = -\frac{2\pi i}{h}\int_0^{t_1} L_{pn}\,dt \quad (p \ne n). \qquad (230')$$


This means that if, after the instant $t_1$, we were to make a new
determination of the state, there would exist a certain probability,
given in first approximation by $|c_p^1|^2$, of finding the system in the
state $\psi_p^0$ rather than in the initial state $\psi_n^0$. If the observation of
the state gives this result, we shall say that a transition from the
nth to the pth state has occurred. The effect of the perturbation
is therefore one of inducing a certain transition probability between
one state (stationary in the absence of perturbation) and all the
others; and the amplitude of this probability for the transition
$n \to p$ is determined, as is shown by (230'), by the element $L_{pn}$ of
the perturbation matrix. The probability of the inverse transition
$p \to n$ is evidently the same, since $L_{np} = L_{pn}^*$.

Note the difference between this result of quantum mechanics 
and the point of view of the old Bohr-Sommerfeld theory, which 
postulated that the system remain in the nth stationary state 
until it jumped into the pth state by a sudden process. Quantum 
mechanics assumes, in addition to the pure stationary states, the 
existence of states resulting from the superposition of different 
stationary states, and views the effect of the perturbation as a 
continuous process by which upon the nth pure state there are 
gradually superimposed varying "amounts" of other states. The
sudden transition into one of the latter will then happen only at 
the instant of observation, by the effect of the inevitable action 
of the instruments of observation on the observed system. 

43. Perturbation of the sinusoidal type. Resonance.⁸ Let us
apply the results of the preceding section to the case in which the
perturbing force is a sinusoidal function of time, of frequency $\nu$.
This case occurs, for instance, when an atom is exposed to
monochromatic radiation.

⁸ The term "resonance" is used here in the classical sense. In quantum
mechanics it also has another meaning, which will be illustrated in Chapter 15.

Let us therefore suppose that the perturbing term of the Hamiltonian
is of the form

$$\mathcal{L} = A(q, p)\cos 2\pi\nu t.$$

The elements of the perturbation matrix are [see formulas (224)
and (225)]

$$L_{sr} = A_{sr}\,e^{\frac{2\pi i}{h}(E_s^0 - E_r^0)t}\cos 2\pi\nu t,$$

with $A_{sr}$ independent of time, or also, setting

$$\frac{E_s^0 - E_r^0}{h} = \nu_{sr},$$

$$L_{sr} = \frac{A_{sr}}{2}\left[e^{2\pi i(\nu_{sr} + \nu)t} + e^{2\pi i(\nu_{sr} - \nu)t}\right]. \qquad (231)$$

The transition-probability amplitude from the state $n$ to the
state $p$, after a time $t_1$ of perturbing action, is given by (230'), which
in that case yields

$$c_p^1 = -\frac{A_{pn}}{2h}\left[\frac{e^{2\pi i(\nu_{pn} + \nu)t_1} - 1}{\nu_{pn} + \nu} + \frac{e^{2\pi i(\nu_{pn} - \nu)t_1} - 1}{\nu_{pn} - \nu}\right]$$

$$\quad = -\frac{iA_{pn}}{h}\left[e^{\pi i(\nu_{pn} + \nu)t_1}\,\frac{\sin \pi(\nu_{pn} + \nu)t_1}{\nu_{pn} + \nu} + e^{\pi i(\nu_{pn} - \nu)t_1}\,\frac{\sin \pi(\nu_{pn} - \nu)t_1}{\nu_{pn} - \nu}\right].$$

Now it is known that the expression $\dfrac{\sin \pi\alpha t_1}{\alpha}$, considered as a
function of $\alpha$, has an absolute maximum equal to $\pi t_1$, for $\alpha = 0$,
whereas the other maxima are considerably lower (the qualitative
behavior is the same as that of the curve in Fig. 20). Therefore
the two terms in the square brackets can take on appreciable values
only for $\nu = -\nu_{pn}$ (for the first term) and for $\nu = \nu_{pn}$ (for the
second term). The first of these cases may be realized if $E_p^0 < E_n^0$,
the second case if $E_p^0 > E_n^0$. In both cases we have, neglecting the
other term,

$$|c_p^1|^2 = \frac{|A_{pn}|^2}{h^2}\,\frac{\sin^2 \pi(|\nu_{pn}| - \nu)t_1}{(|\nu_{pn}| - \nu)^2}.$$
This means that the transition probability from the state $n$ to the
state $p$ is appreciable only if the energy difference between the two
states, $|E_p^0 - E_n^0|$, is very close to the value $h\nu$, or if $|\nu_{pn}|$ is very
close to $\nu$. In practice, the maximum is so sharp that the transition
must be considered possible only if $|\nu_{pn}| = \nu$ (condition for resonance).
This transition, if $E_p^0 > E_n^0$, requires the absorption of energy in the
form of radiation. From this it follows that the atom in the state $n$
can absorb only radiation of frequency $\nu = (1/h)(E_p^0 - E_n^0)$, and
absorbs it in quanta $h\nu$. Thus Bohr's postulate concerning the relation
between energy and frequency is shown to be (for the case of
absorption) a consequence of wave mechanics.
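The sharpness of the resonance can be seen by evaluating the first-order transition probability for a few frequencies. The sketch below assumes hypothetical values of $A_{pn}$, $\nu_{pn}$ and $t_1$ in units with $h = 1$, and keeps both terms of the amplitude obtained from (230') rather than only the resonant one; the function names are illustrative.

```python
import cmath
import math


def sinc_like(alpha, t1):
    """sin(pi*alpha*t1)/alpha, with its limiting value pi*t1 at alpha = 0."""
    if abs(alpha) < 1e-12:
        return math.pi * t1
    return math.sin(math.pi * alpha * t1) / alpha


def transition_probability(A_pn, h, nu_pn, nu, t1):
    """|c_p^1|^2 from (230') for L = A cos(2*pi*nu*t), both terms kept."""
    total = 0j
    for alpha in (nu_pn + nu, nu_pn - nu):
        total += cmath.exp(1j * math.pi * alpha * t1) * sinc_like(alpha, t1)
    return abs(-1j * A_pn / h * total) ** 2


h = 1.0            # units with h = 1 (assumption)
A_pn = 1e-3        # hypothetical matrix element of A
nu_pn = 5.0        # hypothetical Bohr frequency (E_p^0 - E_n^0)/h
t1 = 10.0          # duration of the perturbation

p_on = transition_probability(A_pn, h, nu_pn, 5.0, t1)     # at resonance
p_off = transition_probability(A_pn, h, nu_pn, 4.45, t1)   # detuned
p_peak = (A_pn * math.pi * t1 / h) ** 2                    # height of the maximum
print(p_on, p_off, p_peak)
```

At resonance the probability reaches the height $(|A_{pn}|\pi t_1/h)^2$ of the principal maximum, while a modest detuning already reduces it by more than two orders of magnitude in this example.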

If $E_p^0 < E_n^0$, the transition $n \to p$ takes place with the emission
of radiation. The theory discussed here does not furnish the elements
for calculating the frequency of this radiation; but the complete
theory of the interaction of a radiation field and an atom, due
to Dirac (mentioned in §32 of Part II), makes it possible to show
the validity of the Bohr postulate for this latter case as well. In
addition, it shows the possibility of "spontaneous" transitions
(with emission) from the state $n$ to a lower-lying one, whereas the
transitions dealt with here are induced by external radiation.



CHAPTER 14 


Relativity and Spin 

44. General considerations. Quantum mechanics as developed 
in the preceding chapters was constructed by starting from the 
classical (nonrelativistic) mechanics of a material point; as we 
know, it reduces to the classical case whenever a wave packet may 
be considered to be a point. However, it is known that classical 
mechanics, which holds for velocities which are small compared 
with the velocity of light c, constitutes only a first approximation 
of relativistic mechanics, which is valid for motions with any 
velocity. Hence we must remember that quantum mechanics as 
developed so far is valid only under the same limitations, and that 
in order to obtain a more rigorous and general quantum mechanics 
we must start from the relativistic mechanics of a point particle, 
rather than from classical mechanics. The necessity for this refinement
becomes evident when we consider that the results of
Schrodinger wave mechanics are not invariant under a Lorentz 
transformation. 

Another fact which was partly neglected in the preceding chapters
is the existence of an intrinsic angular momentum (spin) and
of a magnetic moment, both in the electron and in the proton, and
presumably in other particles as well. This spin, as was pointed
out in §25 of Part I and in §62 of Part II, is proved by many facts
of spectroscopy and electromagnetism, and was first postulated by
Uhlenbeck and Goudsmit under the name "hypothesis of the
spinning electron."

At first, an attempt was made to deal separately with these two
causes of inexactitude of quantum mechanics. Thus, extending the
analogy to ordinary mechanics, many workers attempted a
modification of the Schrödinger equation, in order to take the relativistic
correction into account. This effort, which we shall discuss briefly
in §46, was only partially successful. On the other hand, Pauli
succeeded in introducing the spin hypothesis into (nonrelativistic)
quantum mechanics, constructing a remarkable theory which will
be presented in §45. But the most satisfactory solution of both
these questions was found by Dirac, who showed that the two
modifications (the one concerning relativity and the one concerning
the spinning electron) are conceptually reduced to one and the
same modification. This identity results because, when wave
mechanics is given a suitable relativistic form, there follows the
existence of the spin and of the magnetic moment, with their correct
values and rules for spatial quantization, without the necessity of
introducing them by an ad hoc hypothesis. From the Dirac theory
we may then obtain the Pauli theory as a first nonrelativistic
approximation. First, however, we shall deal with the Pauli theory,
because it will furnish a simple example of certain new
mathematical methods which are used on a much larger scale in the
Dirac theory.^1

45. Fundamentals of nonrelativistic spin theory (Pauli). We
recall that the essential point in the hypothesis of Uhlenbeck and
Goudsmit (from which are derived the spectroscopic consequences
which have lent credence to the hypothesis) is that the projection
of the spin upon any direction always has one of the two values
$\pm h/4\pi$, and correspondingly, the projection of the magnetic
moment upon the same direction has the value $\pm\mu_0$ (where $\mu_0$
stands for the absolute value of the Bohr magneton). Let us
therefore introduce three new observables $\sigma_x$, $\sigma_y$, $\sigma_z$,
representing the projections of the spin upon the three axes, measured
in units of $h/4\pi$; and in order to reflect the Uhlenbeck-Goudsmit
hypothesis, we shall assume that each of these has only the
eigenvalues $\pm 1$. We call them "spin components" (implying "in units
of $h/4\pi$"), and, according to common usage, we shall denote the
operators corresponding to them by the same symbols. Furthermore, we
shall introduce the components $\mu_x$, $\mu_y$, $\mu_z$ of the magnetic
moment, as three observables (or three operators) related to the
$\sigma$ by the formulas

$$\mu_x = -\mu_0\sigma_x, \qquad \mu_y = -\mu_0\sigma_y, \qquad \mu_z = -\mu_0\sigma_z. \tag{233}$$

^1 Essentially, the Dirac theory deals with the case of a single electron.
Until quite recently a rigorous and completely relativistic theory of a system
of several electrons was lacking. However, during 1948 and 1949 great progress
was made in that direction through the contributions of Tomonaga and
others in Japan, and Schwinger, Feynman, Dyson, and others in the United
States. Following is a list of a few of the most important papers on this
subject: J. Schwinger, Phys. Rev. 74, 1439 (1948); ibid., 75, 651 (1949); ibid.,
76, 790 (1949); R. P. Feynman, Phys. Rev. 76, 749 (1949); ibid., 76, 769 (1949);
F. J. Dyson, Phys. Rev. 75, 486 (1949); ibid., 75, 1736 (1949). Further
references may be found in some of these papers.

Let us now investigate the properties of the operators $\sigma_x$, $\sigma_y$,
$\sigma_z$ defined in this manner. First we note that since they have only
the two eigenvalues $\pm 1$, their squares will have the unique
eigenvalue $+1$, so that it will be legitimate to write

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1. \tag{234}$$

Now it is necessary to establish the commutation rules of the
operators $\sigma_x$, $\sigma_y$, $\sigma_z$. Pauli did this by assuming that the
components of the spin behave in this respect like the components of an
ordinary angular momentum, which, as we have shown in §30, satisfy the
commutation relations (125). Since these relations hold for angular
momenta in ordinary units, we must substitute $(h/4\pi)\sigma_x$ for
$\mathfrak{M}_x$, and so on. In this way we find the following commutation
relations:

$$\begin{aligned}
\sigma_y\sigma_z - \sigma_z\sigma_y &= 2i\sigma_x, \\
\sigma_z\sigma_x - \sigma_x\sigma_z &= 2i\sigma_y, \\
\sigma_x\sigma_y - \sigma_y\sigma_x &= 2i\sigma_z.
\end{aligned} \tag{235}$$

Let us now multiply the second relation by $\sigma_z$ (from the right),
and the third one by $\sigma_y$ (also from the right); upon adding them we get

$$\sigma_z\sigma_x\sigma_z - \sigma_y\sigma_x\sigma_y + \sigma_x(\sigma_y^2 - \sigma_z^2) = 2i(\sigma_y\sigma_z + \sigma_z\sigma_y).$$

Substituting into the first two terms the expression for $\sigma_x$ obtained
from the first relation (235), and recalling (234), we recognize that
the whole left-hand side vanishes, and hence there remains

$$\sigma_y\sigma_z + \sigma_z\sigma_y = 0.$$

Two other relations analogous to this one may be obtained in
the same way. Therefore the components of the spin anticommute.^2

^2 Incidentally, it follows that $\sigma_x^2 + \sigma_y^2 + \sigma_z^2 = 3$, and hence
the total spin $\sigma$ (in units of $h/4\pi$), defined by
$\sigma^2 = \sigma_x^2 + \sigma_y^2 + \sigma_z^2$, turns out to be $\sqrt{3}$ and
not 1 as is usually assumed in the vector model. This discrepancy is due to
the inadequacy of this model, which was mentioned several times. In fact,
we note that if in the theory of Uhlenbeck and Goudsmit we use the total spin
instead of the projection of the spin on a given direction, we obtain formulas
requiring slight modifications in order to agree with experience (see, for example,
No. 27 of the Bibliography, page 198).

Upon using (235) again, we may write

$$\begin{aligned}
\sigma_y\sigma_z &= -\sigma_z\sigma_y = i\sigma_x, \\
\sigma_z\sigma_x &= -\sigma_x\sigma_z = i\sigma_y, \\
\sigma_x\sigma_y &= -\sigma_y\sigma_x = i\sigma_z.
\end{aligned} \tag{236}$$

When we take the spin into consideration, a measurement of
the coordinates x, y, z of the electron is no longer a "maximum
observation," since it does not yet completely define the state of
the electron. In order to make it complete, it is necessary to add
an observation of the projection of the spin (or of the magnetic
moment, which amounts to the same thing) along any direction,
which is usually chosen so as to coincide with the z-axis. The
determination of the state is therefore complete with an observation
of x, y, z, $\sigma_z$ (note that we may not add an observation of $\sigma_x$ and
of $\sigma_y$; the latter are incompatible with the former, because their
respective operators do not commute). The state will therefore now be
defined by a function $\psi(x, y, z, \sigma_z, t)$ rather than only by
$\psi(x, y, z, t)$.

However, since the newly introduced variable (called spin variable)
can have only the two values $\pm 1$, it is often convenient to consider
it as an index rather than as an ordinary variable, that is, to write
$\psi_1(x, y, z, t)$ and $\psi_2(x, y, z, t)$ instead of $\psi(x, y, z, +1, t)$
and $\psi(x, y, z, -1, t)$. The introduction of the spin variable is
therefore equivalent to introducing two functions $\psi_1$, $\psi_2$ instead
of only one $\psi$; this is the main feature of the Pauli theory (see §25
of Part II). The significance of these two functions is obviously as
follows: $|\psi_1|^2\,dx\,dy\,dz$ represents the probability of finding the
electron in the element of volume defined by x, y, z, x + dx, y + dy,
z + dz, and with spin $\sigma_z = 1$ (or else with $\mu_z = -\mu_0$), whereas
$|\psi_2|^2\,dx\,dy\,dz$ is the probability of finding it in the same element
of volume, but with $\sigma_z = -1$ (or with $\mu_z = +\mu_0$). Using the
terminology of the vector model, we would say that in the first case the
spin is parallel to the z-axis, in the second case antiparallel to it.
The privileged role given to the z-axis may of course just as well be
given to the x- or y-axes, or to any other direction. It would then be
necessary to use another pair of functions.

The normalization condition is evidently

$$\int \left(|\psi_1|^2 + |\psi_2|^2\right) dx\,dy\,dz = 1. \tag{237}$$

It is sometimes convenient (in analogy to what was done in §6a)
to consider the pair $\psi_1$, $\psi_2$ as a matrix with a single column and two
rows, which is indicated symbolically by $\psi$; that is, to put

$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}. \tag{238}$$


In correspondence with this representation, we shall find it
convenient to represent the operators $\sigma_x$, $\sigma_y$, $\sigma_z$ by means of
matrices, noting that they have only two eigenvalues and will therefore be
represented by matrices with only two rows and two columns, of
the form

$$\begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix},$$

where the k are constants which, in order for the matrix to be Hermitian,
must satisfy the condition $k_{21} = k_{12}^*$. These operators, when applied to
(238), replace the functions $\psi_1$, $\psi_2$ by two of their linear
combinations, following the scheme (in accordance with the rule for matrix
multiplication)

$$\begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix}
\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} =
\begin{pmatrix} k_{11}\psi_1 + k_{12}\psi_2 \\ k_{21}\psi_1 + k_{22}\psi_2 \end{pmatrix}. \tag{239}$$

(Since in this process $\psi_1$ and $\psi_2$ behave as if they did not depend
upon x, y, z, we shall say that such an operator "operates only upon
the spin variable.") Let us proceed to the actual determination of
these three matrices, which are denoted by the same symbols $\sigma_x$,
$\sigma_y$, $\sigma_z$ as the operators they represent. We observe first of all that
$\psi_1$ and $\psi_2$, because of the meaning given them above, are simply the
eigenfunctions of the operator $\sigma_z$, corresponding to the eigenvalues
$+1$ and $-1$, respectively; that is, we must have

$$\sigma_z\psi_1 = \psi_1, \qquad \sigma_z\psi_2 = -\psi_2,$$

or else

$$\sigma_z \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} \psi_1 \\ -\psi_2 \end{pmatrix}. \tag{240}$$

From this we deduce, by virtue of (239),

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{241}$$

In order to determine the matrices $\sigma_x$ and $\sigma_y$, let us first write them
in the general Hermitian form

$$\sigma_x = \begin{pmatrix} a_1 & a_3 \\ a_3^* & a_2 \end{pmatrix}, \qquad
\sigma_y = \begin{pmatrix} b_1 & b_3 \\ b_3^* & b_2 \end{pmatrix},$$

and let us require each of these to anticommute with $\sigma_z$. By (241),
we have

$$\sigma_x\sigma_z = \begin{pmatrix} a_1 & -a_3 \\ a_3^* & -a_2 \end{pmatrix}, \qquad
\sigma_z\sigma_x = \begin{pmatrix} a_1 & a_3 \\ -a_3^* & -a_2 \end{pmatrix},$$

and hence, since $\sigma_x\sigma_z + \sigma_z\sigma_x = 0$, we must have $a_1 = a_2 = 0$. We
then have

$$\sigma_x^2 = \begin{pmatrix} a_3 a_3^* & 0 \\ 0 & a_3 a_3^* \end{pmatrix},$$

and in order for this to reduce to a unit matrix, we must have
$a_3 a_3^* = 1$, that is, $a_3 = e^{i\alpha}$ with $\alpha$ real (and arbitrary). If we
proceed in analogous fashion for $\sigma_y$, it may be concluded that we
must have

$$\sigma_x = \begin{pmatrix} 0 & e^{i\alpha} \\ e^{-i\alpha} & 0 \end{pmatrix}, \qquad
\sigma_y = \begin{pmatrix} 0 & e^{i\beta} \\ e^{-i\beta} & 0 \end{pmatrix}.$$

Then, imposing the condition $\sigma_x\sigma_y + \sigma_y\sigma_x = 0$, we find between
$\alpha$ and $\beta$ the relation $\alpha - \beta = \pi/2 + n\pi$ (with n any integer). The
arbitrariness which remains in one of the two constants $\alpha$, $\beta$
has no physical consequences.^3 Taking $\alpha = 0$, $\beta = -\pi/2$, we
finally obtain

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{241'}$$

These, together with (241), give the desired expressions. It is
apparent that $\sigma_z$ is of the diagonal form, since because of the
privileged role which we have assigned to the z-axis, the matrices
are referred to the "$\sigma_z$-representation." If some other
representation were adopted (and hence another meaning for $\psi_1$ and $\psi_2$), the
three matrices would transform as was explained in §8.
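
The algebra of this section can be checked mechanically. The following is only a minimal sketch in plain Python, assuming the matrices (241) and (241') stored as nested lists; the helper names `mul` and `comb` are introduced here for illustration and do not come from the text. It tests the squares (234), the commutation relations (235), the anticommutation, and the products (236).

```python
# Pauli matrices in the sigma_z-representation, as nested lists (exact arithmetic).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def mul(a, b):
    """Product of two square matrices given as nested lists (hypothetical helper)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def comb(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two matrices (hypothetical helper)."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


# (234): each square is the unit matrix.
squares_ok = all(mul(s, s) == ONE for s in (SX, SY, SZ))

# Cyclic triples (a, b, c) for which (235) and (236) read
#   ab - ba = 2i c,   ab + ba = 0,   ab = i c.
triples = [(SY, SZ, SX), (SZ, SX, SY), (SX, SY, SZ)]
commutators_ok = all(
    comb(1, mul(a, b), -1, mul(b, a)) == comb(2j, c, 0, ZERO)
    for a, b, c in triples)
anticommute_ok = all(
    comb(1, mul(a, b), 1, mul(b, a)) == ZERO for a, b, c in triples)
products_ok = all(mul(a, b) == comb(1j, c, 0, ZERO) for a, b, c in triples)

print(squares_ok, commutators_ok, anticommute_ok, products_ok)
```

Since all entries are 0, ±1 or ±i, the comparisons are exact and no numerical tolerance is needed.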

By an obvious extension of the principles of §22, we may obtain
the operator corresponding to any quantity relative to the spin by
writing down the classical expression for this quantity as a function
of the spin components and replacing them by their corresponding
operators. For instance, the projection of the spin upon any
direction n, with direction cosines $\alpha$, $\beta$, $\gamma$, will be represented by
the operator

$$\sigma_n = \alpha\sigma_x + \beta\sigma_y + \gamma\sigma_z, \tag{242}$$

or, in the $\sigma_z$-representation, by the matrix

$$\sigma_n = \begin{pmatrix} \gamma & \alpha - i\beta \\ \alpha + i\beta & -\gamma \end{pmatrix}, \tag{242'}$$

which may be immediately obtained by substituting (241) and
(241') into (242).

^3 This condition corresponds to the arbitrariness of the constant in the
argument of $\psi$ mentioned on page 155.

As an application, let us suppose that an observation of the
spin with respect to a certain direction n, with direction cosines $\alpha$,
$\beta$, $\gamma$, has given the result $+1$, and that immediately afterward an
observation of the spin is carried out with respect to the z-axis.
What is the probability of finding a value $+1$, and what is the
probability of finding $-1$? The state resulting from the first
observation will be defined by an expression
$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$ such that (see §22)

$$\sigma_n\psi = 1\cdot\psi,$$

which is equivalent, when (242') is applied, to

$$\begin{aligned}
\gamma\psi_1 + (\alpha - i\beta)\psi_2 &= \psi_1, \\
(\alpha + i\beta)\psi_1 - \gamma\psi_2 &= \psi_2.
\end{aligned}$$

These two homogeneous equations (whose determinant is zero
by virtue of $\alpha^2 + \beta^2 + \gamma^2 = 1$) yield

$$\frac{\psi_1}{\psi_2} = \frac{\alpha - i\beta}{1 - \gamma},$$

and hence the ratio of the probabilities of the two results $+1$ and
$-1$ is

$$\frac{|\psi_1|^2}{|\psi_2|^2} = \frac{\alpha^2 + \beta^2}{(1 - \gamma)^2} = \frac{1 + \gamma}{1 - \gamma}.$$

If, in particular, the direction n were normal to the z-axis, the
two probabilities would be equal.

It is to be noted that all these results cannot be interpreted by
the simple vector model, according to which the second observation
would certainly give the result $\sigma_z = \gamma$.
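
As a numerical illustration of this application, here is a small sketch in plain Python. The function name `z_probabilities` and the sample direction are assumptions made for the example, not part of the text. It builds the unnormalized solution $\psi_1/\psi_2 = (\alpha - i\beta)/(1 - \gamma)$ and returns the two probabilities; since the ratio is $(1+\gamma)/(1-\gamma)$ and the probabilities add up to 1, the first one should come out as $(1+\gamma)/2$.

```python
import math


def z_probabilities(alpha, beta, gamma):
    """Probabilities (P_plus, P_minus) of finding sigma_z = +1 or -1
    right after sigma_n = +1 was observed along n = (alpha, beta, gamma).
    Hypothetical helper; the direction is normalized defensively."""
    norm = math.sqrt(alpha**2 + beta**2 + gamma**2)
    alpha, beta, gamma = alpha / norm, beta / norm, gamma / norm
    if math.isclose(gamma, 1.0):
        return 1.0, 0.0          # n is the z-axis itself: psi_2 = 0
    # Unnormalized solution of sigma_n psi = psi:
    #   psi_1 / psi_2 = (alpha - i beta) / (1 - gamma)
    psi1 = complex(alpha, -beta)
    psi2 = 1.0 - gamma
    w1, w2 = abs(psi1) ** 2, abs(psi2) ** 2
    return w1 / (w1 + w2), w2 / (w1 + w2)


# Sample direction with gamma = 0.6, so (1 + gamma)/(1 - gamma) = 4.
p_plus, p_minus = z_probabilities(0.8, 0.0, 0.6)
ratio = p_plus / p_minus
print(p_plus, p_minus, ratio)
```

For a direction normal to the z-axis, for example `z_probabilities(1.0, 0.0, 0.0)`, the two returned values coincide, in agreement with the remark above.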

The Hamiltonian of an electron having a spin and situated in a
magnetic field may be written down by analogy to its expression in
classical mechanics, which in first approximation^4 may be obtained
by simply adding to the Hamiltonian of an electron without spin
(§31) the interaction energy between its magnetic moment and the
magnetic field. That is, if we call $m_0$ the mass, $-e$ the charge,^5
E the electric field, H the magnetic field, V the scalar potential
and A the vector potential, and $\mu$ the magnetic moment, we shall
obtain

$$\mathcal{H} = \frac{1}{2m_0}\sum_{r=1}^{3}\left(p_r + \frac{e}{c}A_r\right)^2 - eV - \mu\cdot H. \tag{243}$$

If $p_r$ and $\mu_r$ are replaced by the corresponding operators
$\frac{h}{2\pi i}\frac{\partial}{\partial x_r}$ and $-\mu_0\sigma_r$, this expression
transforms into the operator

$$\mathfrak{H} = \frac{1}{2m_0}\sum_{r=1}^{3}\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_r} + \frac{e}{c}A_r\right)^2 - eV + \mu_0\,\sigma\cdot H, \tag{244}$$

where, extending the notation for the scalar product in an obvious
manner, we have indicated by the symbol $\sigma\cdot H$ the operator
(or matrix)

$$\sigma\cdot H = \sum_{r=1}^{3}\sigma_r H_r =
\begin{pmatrix} H_z & H_x - iH_y \\ H_x + iH_y & -H_z \end{pmatrix}. \tag{245}$$

The Hamiltonian operator $\mathfrak{H}$ formed in this way then allows us
to write the equation for $\psi$ in the usual manner, namely,

$$\mathfrak{H}\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}, \tag{246}$$

and for stationary states

$$\mathfrak{H}\psi = E\psi. \tag{246'}$$

^4 The rigorous formula would also contain terms of the order v/c compared
with the others, representing the action of the electric field upon the magnetic
moment in motion.

^5 In this whole chapter, we shall indicate by e the charge of the electron in
absolute value, and by $m_0$ its rest mass.

Equation (246), as well as (246'), may formally be considered
to be an equation in $\psi(x, y, z, \sigma_z, t)$, or else, more explicitly, as a
system of two equations in the two functions $\psi_k(x, y, z, t)$ (with
k = 1, 2). For example, if we indicate by $\mathfrak{H}_0$ the part of the
Hamiltonian (244) which does not operate on the spin, that is, if
we write $\mathfrak{H} = \mathfrak{H}_0 + \mu_0\,\sigma\cdot H$, we may write (246) more
explicitly, by means of (245), as two equations:

$$\begin{aligned}
\mathfrak{H}_0\psi_1 + \mu_0 H_z\psi_1 + \mu_0(H_x - iH_y)\psi_2 &= -\frac{h}{2\pi i}\frac{\partial\psi_1}{\partial t}, \\
\mathfrak{H}_0\psi_2 + \mu_0(H_x + iH_y)\psi_1 - \mu_0 H_z\psi_2 &= -\frac{h}{2\pi i}\frac{\partial\psi_2}{\partial t},
\end{aligned}$$

and similarly for (246'). If the magnetic field is zero or negligible,
each of the two $\psi$ satisfies the ordinary Schrödinger equation.
Hence, if we are dealing with a nondegenerate^6 stationary state,
these two must differ only by a constant coefficient.

^6 The degeneracy due to the spin is understood to be excluded.

Separation of the spin variable. If the magnetic field is uniform,
the part of the Hamiltonian depending on the spin will not contain
the position coordinates, and it is then possible (for a stationary
state) to perform the "separation" of the spin variable $\sigma_z$ from
x, y, z, that is, to write $\psi(x, y, z, \sigma_z, t)$ in the form
$\psi_n(x, y, z, t)\,\varphi_s(\sigma_z)$, where $\sigma_z = \pm 1$ and hence
$\varphi_s(\sigma_z)$ represents, rather than a true function, the total of two
constants $\varphi_s(1) = \alpha_s$, $\varphi_s(-1) = \beta_s$, or the matrix
$\begin{pmatrix} \alpha_s \\ \beta_s \end{pmatrix}$. The index s is the spin
quantum number which, as we shall see, may take on only two values,
whereas n stands as usual for the group of three orbital quantum numbers.

Indeed (246') will now split into the two expressions

$$\mathfrak{H}_0\psi_n = E'_n\psi_n, \tag{247}$$

$$\mu_0\,\sigma\cdot H\,\varphi_s = E''_s\varphi_s, \tag{248}$$

with $E'_n + E''_s = E$. The first of these is the ordinary Schrödinger
equation. Hence $E'_n$ represents one of the energy levels of ordinary
wave mechanics, and $\psi_n$ is an eigenfunction which corresponds to it.
The second equation (where $E''_s$ represents the energy due to the
action of the magnetic field upon the magnetic moment of the spin)
may be more explicitly written as two algebraic equations

$$\begin{aligned}
\mu_0 H_z\alpha_s + \mu_0(H_x - iH_y)\beta_s &= E''_s\alpha_s, \\
\mu_0(H_x + iH_y)\alpha_s - \mu_0 H_z\beta_s &= E''_s\beta_s,
\end{aligned} \tag{249}$$

which are linear and homogeneous in $\alpha_s$ and $\beta_s$. Since the latter
are not both zero, we must have

$$\begin{vmatrix} \mu_0 H_z - E''_s & \mu_0(H_x - iH_y) \\ \mu_0(H_x + iH_y) & -\mu_0 H_z - E''_s \end{vmatrix} = 0,$$

that is, $E''^2_s = \mu_0^2(H_x^2 + H_y^2 + H_z^2) = \mu_0^2 H^2$. From this we get the
two values of $E''_s$ corresponding to the two values 1 and 2 of the
index s:

$$E''_1 = \mu_0 H, \qquad E''_2 = -\mu_0 H. \tag{250}$$

This result justifies the success of modelistic spin theory, in
which it was postulated that the spin could line up parallel or
antiparallel to the field, the two values (250) of the magnetic energy
corresponding to these two cases.
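
The secular condition behind (250) is easy to test numerically. The sketch below uses plain Python; the function names and the sample field components are assumptions chosen for illustration. It forms the matrix (245), evaluates the determinant of $\mu_0\,\sigma\cdot H - E''_s$ for the two values (250), and should find it equal to zero in both cases.

```python
import math


def sigma_dot_h(hx, hy, hz):
    """Matrix (245) of sigma . H in the sigma_z-representation."""
    return [[hz, complex(hx, -hy)], [complex(hx, hy), -hz]]


def spin_energies(mu0, hx, hy, hz):
    """The two magnetic energies (250): +mu0*H and -mu0*H."""
    h = math.sqrt(hx**2 + hy**2 + hz**2)
    return mu0 * h, -mu0 * h


def secular_determinant(mu0, hx, hy, hz, energy):
    """Determinant of mu0 * sigma.H - energy * 1, which must vanish
    for the homogeneous system (249) to have a nonzero solution."""
    m = sigma_dot_h(hx, hy, hz)
    a, b = mu0 * m[0][0] - energy, mu0 * m[0][1]
    c, d = mu0 * m[1][0], mu0 * m[1][1] - energy
    return a * d - b * c


# Illustrative field (1, 2, 2) in arbitrary units, so that H = 3.
e1, e2 = spin_energies(1.0, 1.0, 2.0, 2.0)
dets = [secular_determinant(1.0, 1.0, 2.0, 2.0, e) for e in (e1, e2)]
print(e1, e2, dets)
```

Any other energy value gives a nonvanishing determinant, so that (249) then admits only the trivial solution $\alpha_s = \beta_s = 0$.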

If we now assume the magnetic field to be directed along the
z-axis, and if we solve the system (249) (determining the
normalization constant such that $|\alpha_s|^2 + |\beta_s|^2 = 1$), we find

for $E''_1$: $\alpha_1 = 1$, $\beta_1 = 0$,

for $E''_2$: $\alpha_2 = 0$, $\beta_2 = 1$.

Hence we have, in the first case,
$\psi_{n,1} = \begin{pmatrix} \psi_n \\ 0 \end{pmatrix}$, which means that
the spin is certainly directed along the positive z-axis, and in the
second case $\psi_{n,2} = \begin{pmatrix} 0 \\ \psi_n \end{pmatrix}$, where the
spin is directed with certainty in the opposite direction.

46. The relativistic extension of the Schrödinger equation.
Before exploring the Dirac theory, we shall show what equation
for $\psi$ we should obtain if, applying the principle of §22, we started
out from the relativistic expression for the Hamiltonian (rather
than from the classical expression, as was done in §19), and if we
transformed it into an operator by means of the usual substitutions
(S), (S') of §19.

Let us consider a particle^7 of charge $\varepsilon$ (in electrostatic units) and
(rest) mass $m_0$, placed in an electric and magnetic field, derived
from a scalar potential V and a vector potential A. The relativistic
Hamiltonian is^8

$$\mathcal{H} = c\sqrt{m_0^2c^2 + \sum_k\left(p_k - \frac{\varepsilon}{c}A_k\right)^2} + \varepsilon V, \tag{251}$$

where the momenta $p_k$ are given by (see §31)

$$p_k = \frac{m_0 v_k}{\sqrt{1 - v^2/c^2}} + \frac{\varepsilon}{c}A_k. \tag{252}$$

The energy integral is

$$\mathcal{H} - E = 0. \tag{253}$$

However, since the first term is irrational in the $p_k$, it is
convenient, before performing the substitutions (S), (S'), to rationalize
it by isolating the radical and squaring. Hence we start, instead
of from (253), from the relation

$$\frac{1}{c^2}(E - \varepsilon V)^2 - \sum_k\left(p_k - \frac{\varepsilon}{c}A_k\right)^2 - m_0^2c^2 = 0. \tag{254}$$

Performing the substitutions (S), (S') and applying the operator
obtained to $\psi$ (denoting, as usual, the potential energy $\varepsilon V$ by U),
we get the following equation, which should represent the
relativistic generalization of the Schrödinger equation:

$$\left[\frac{1}{c^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + U\right)^2 - \sum_k\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_k} - \frac{\varepsilon}{c}A_k\right)^2 - m_0^2c^2\right]\psi = 0. \tag{255}$$


^7 We denote by $\varepsilon$ the charge of a general particle, since in this chapter we
shall reserve the letter e for the absolute value of the electronic charge. In the
case of the electron, $\varepsilon = -e$.

^8 In fact, the kinetic energy is $m_0c^2\left(\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right)$; the rest
energy, $m_0c^2$; the electrostatic energy, $\varepsilon V$. The total energy is therefore
$\frac{m_0c^2}{\sqrt{1 - v^2/c^2}} + \varepsilon V$. In order to express the total energy in
terms of the $p_k$, note that we get from (252):
$\frac{m_0^2v^2}{1 - v^2/c^2} = \sum_k\left(p_k - \frac{\varepsilon}{c}A_k\right)^2$, from which

$$\frac{1}{1 - v^2/c^2} = \frac{m_0^2c^2 + \sum_k\left(p_k - \frac{\varepsilon}{c}A_k\right)^2}{m_0^2c^2}.$$

Substituting into the expression for the energy, we get (251).





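First we shall verify that in the nonrelativistic limit, that is,
when c may be considered sufficiently large compared with the
other velocities involved, this equation leads back to the Schrödinger
theory. However, we find that the $\psi$ which occurs in (255) differs
from the Schrödinger $\psi$ (which we denote by $\psi_0$ for the moment) by
a factor of modulus 1, of no importance in the calculation of $\rho$ and j,
which corresponds to the fact that in the energy there will also be
included the rest energy $m_0c^2$ of the electron. In fact, if we put

$$\psi = \psi_0\,e^{-\frac{2\pi i}{h}m_0c^2 t}$$

and substitute this expression into (255), we have, by a simple
calculation,

$$\frac{1}{2m_0c^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + U\right)^2\psi_0 - \left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + U\right)\psi_0 - \frac{1}{2m_0}\sum_k\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_k} - \frac{\varepsilon}{c}A_k\right)^2\psi_0 = 0.$$

If we consider the first term to be negligible because of the
factor $1/c^2$, we obtain the nonrelativistic approximation

$$\frac{1}{2m_0}\sum_k\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_k} - \frac{\varepsilon}{c}A_k\right)^2\psi_0 + U\psi_0 = -\frac{h}{2\pi i}\frac{\partial\psi_0}{\partial t},$$

an equation which coincides with (142') of §31, except for the
difference in notation.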
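
The same limit can be seen at the classical level for a free particle: once the rest energy $m_0c^2$ is subtracted from (251), what remains tends to $p^2/2m_0$ when $p \ll m_0c$. The following is only an illustrative sketch in plain Python; the function names and the numerical values (units with $m_0 = 1$ and $c = 137$) are assumptions made here, not taken from the text.

```python
import math


def kinetic_relativistic(p, m0, c):
    """Kinetic energy c*sqrt(m0^2 c^2 + p^2) - m0 c^2 of a free particle,
    written in an algebraically equivalent form that avoids cancellation
    for small p."""
    rest = m0 * c**2
    total = c * math.sqrt(m0**2 * c**2 + p**2)
    return (c * p) ** 2 / (total + rest)


def kinetic_classical(p, m0):
    """Nonrelativistic kinetic energy p^2 / (2 m0)."""
    return p**2 / (2.0 * m0)


# Illustrative values in units where m0 = 1 and c = 137.
m0, c = 1.0, 137.0
for p in (0.1, 1.0, 10.0, 100.0):
    t_rel = kinetic_relativistic(p, m0, c)
    t_cl = kinetic_classical(p, m0)
    # The last column is the relative error of the classical expression.
    print(p, t_rel, t_cl, (t_cl - t_rel) / t_cl)
```

The relative error grows roughly like $p^2/4m_0^2c^2$ for small momenta, which is the classical counterpart of neglecting the term with the factor $1/c^2$.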

However, it was recognized that for various reasons (255) could
not in general be adopted as the relativistic generalization of the
Schrödinger equation; for example, according to (255) the integral
of $|\psi|^2$ varies in time, so that it may not be equated to 1.
These difficulties may be avoided, however, in the special case of an
electron not subject to forces (V = A = 0), in which case the
equation becomes

$$\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} - \sum_k\frac{\partial^2\psi}{\partial x_k^2} + \frac{4\pi^2m_0^2c^2}{h^2}\psi = 0 \tag{256}$$

and constitutes the starting point of the Dirac theory.

47. Principles of the Dirac theory. The fundamental idea which
has led to the Dirac theory is as follows. Let us postulate, in
analogy to the Schrödinger theory and the Pauli theory, that the
probability density P is calculated (as was pointed out in §25 of
Part II) by means of a certain, at present undetermined number N of
functions $\psi_1$, $\psi_2$, . . . (in the Schrödinger theory N = 1, in the
Pauli theory N = 2; we shall see in what follows that in the Dirac
theory we have to take N = 4), that is, by the relation

$$P = \psi_1^*\psi_1 + \psi_2^*\psi_2 + \cdots + \psi_N^*\psi_N. \tag{257}$$

If each of the $\psi$ were contained only in a differential equation of
the second order in the time, we should have to assign the initial
values of $\psi$ and $\partial\psi/\partial t$ in order to determine the further evolution
of $\psi$ in time. Instead, it is natural to postulate that the
assignment of the initial value of $\psi$ (in all space) is sufficient to determine
P at all later instants, and hence to assume that the N functions $\psi$
satisfy a system of N differential equations of the first order in the time.
Since in any relativistic theory the variable t must be treated in the
same manner as the spatial coordinates $x_1$, $x_2$, $x_3$, it follows that
these equations must also be of the first order with respect to
$x_1$, $x_2$, $x_3$. Of course, by differentiation it is always possible to
obtain second-order equations from equations of the first order;
these will be necessary consequences of the first-order equations (but
not conversely). We shall therefore require, in the particular case of
no electromagnetic field, that the relativistic equation (256) be
satisfied by each of the $\psi$ as a consequence of the first-order
equations which we are about to establish.^9 At first we shall confine
ourselves to the case of a free electron.

The simplest hypothesis which might be made concerning the
desired equations is that they be linear with constant coefficients.^10
As we shall see, it is possible to satisfy all required conditions by
this assumption. On the supposition that the equations may be
solved for the time derivatives, the system may be written in the
form

$$\frac{1}{c}\frac{\partial\psi_\mu}{\partial t} + \sum_{k=1}^{3}\sum_{\lambda=1}^{N}\alpha^k_{\mu\lambda}\frac{\partial\psi_\lambda}{\partial x_k} + \frac{2\pi i}{h}\sum_{\lambda=1}^{N}\beta_{\mu\lambda}\psi_\lambda = 0 \tag{258}$$

(where $\mu$ = 1, 2, . . . N), in which the coefficients $\alpha^k_{\mu\lambda}$,
$\beta_{\mu\lambda}$ are constants to be determined. (The reason for the factor
$2\pi i/h$ evident in the last summation will appear subsequently.)
Furthermore, we have introduced the convention, which will be retained
in the following, of denoting by Greek letters $\lambda$, $\mu$, . . . the indices
(taking the values from 1 to N) which distinguish the various $\psi$,
and by ordinary letters the indices (= 1, 2, 3) which distinguish
the three space coordinates (which are designated indifferently by
x, y, z or $x_1$, $x_2$, $x_3$). The writing is considerably simplified if we
introduce matrix notation, denoting by $\alpha^k$ and $\beta$, respectively, the
four matrices with N rows and N columns, whose element of row $\mu$
and column $\lambda$ is $\alpha^k_{\mu\lambda}$ and $\beta_{\mu\lambda}$, and consider $\psi$ as the symbol
of a matrix with N rows and a single column, as was done in §45 for N = 2.
Then the N equations (258) may be condensed into the expression

$$\frac{1}{c}\frac{\partial\psi}{\partial t} + \sum_k\alpha^k\frac{\partial\psi}{\partial x_k} + \frac{2\pi i}{h}\beta\psi = 0, \tag{259}$$

which, upon the introduction of the operators

$$p_k = \frac{h}{2\pi i}\frac{\partial}{\partial x_k}, \qquad p_4 = -\frac{1}{c}\frac{h}{2\pi i}\frac{\partial}{\partial t},$$

is more conveniently written

$$\left(-p_4 + \sum_k\alpha^k p_k + \beta\right)\psi = 0. \tag{259'}$$

^9 Analogously, in the electromagnetic theory of light, the wave equation
(of the second order) is a consequence of the Maxwell equations (of the first
order).

^10 This hypothesis may also be justified by the argument that no point of
space-time should occupy a privileged position.


Before proceeding to the determination of the coefficients of
(258), that is, of the four matrices $\alpha^k$, $\beta$, we should obtain the
expressions for the average electric charge density $\rho$ and average
electric current density j, which will be the generalizations of those
already found for the nonrelativistic case.

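48. Electric charge density and electric current density. The
probability density P is given by (257). Upon introducing the
matrix $\psi$ and also the one-rowed matrix with N columns

$$\tilde\psi = \left|\,\psi_1^*,\ \psi_2^*,\ \ldots,\ \psi_N^*\,\right| \tag{260}$$

(notation in accordance with that of §7), we may write (257) in the
short form

$$P = \tilde\psi\psi, \tag{261}$$

as may readily be verified by applying the rule for the product of
two matrices. Consequently, the average electric charge density
$\rho = -eP$ will be given by

$$\rho = -e\,\tilde\psi\psi. \tag{262}$$

In order to find the average electric current density j, we observe
that the latter must satisfy the equation of continuity

$$\frac{\partial\rho}{\partial t} + \operatorname{div} j = 0, \tag{263}$$

which, when the expression for $\rho$ is introduced, becomes

$$\operatorname{div} j = e\left(\frac{\partial\tilde\psi}{\partial t}\psi + \tilde\psi\frac{\partial\psi}{\partial t}\right).$$

In the last equation, let us replace the derivatives of $\tilde\psi$ and $\psi$ by
their expressions obtained from (259) and from its complex
conjugate. Denoting, as usual, by $\tilde\alpha^k$, $\tilde\beta$ the matrices obtained from
$\alpha^k$ and $\beta$ by interchanging rows and columns, and taking the complex
conjugate of each element,^12 we obtain

$$\frac{1}{c}\frac{\partial\tilde\psi}{\partial t} = -\sum_k\frac{\partial\tilde\psi}{\partial x_k}\tilde\alpha^k + \frac{2\pi i}{h}\tilde\psi\tilde\beta.$$

Substituting these derivatives into the expression for div j, we get

$$\operatorname{div} j = -ce\sum_k\left(\frac{\partial\tilde\psi}{\partial x_k}\tilde\alpha^k\psi + \tilde\psi\alpha^k\frac{\partial\psi}{\partial x_k}\right) + \frac{2\pi i\,ce}{h}\,\tilde\psi(\tilde\beta - \beta)\psi.$$

In order that the right-hand member actually have the form
of a divergence, it suffices to impose upon the matrices the conditions

$$\tilde\alpha^k = \alpha^k, \qquad \tilde\beta = \beta,$$

which imply that these matrices must be Hermitian. The formula
then becomes

$$\operatorname{div} j = -ce\sum_k\frac{\partial}{\partial x_k}\left(\tilde\psi\alpha^k\psi\right),$$

and we may take as components of j the expressions

$$j_k = -ce\,\tilde\psi\alpha^k\psi, \tag{264}$$

or else explicitly,

$$j_k = -ce\sum_{\lambda,\mu}\psi_\lambda^*\,\alpha^k_{\lambda\mu}\,\psi_\mu. \tag{264'}$$

^12 Note that in order to preserve the validity of the rule for matrix
multiplication, the matrix $\psi$ is always written to the right of $\alpha^k$, $\beta$, and
$\tilde\psi$ to the left.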
X.M 

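49. The Dirac equations for the free electron. Let us now
determine the coefficients of equations (258), or the matrices $\alpha^k$, $\beta$,
imposing the condition that from these first-order equations there
follows, as a necessary consequence, the relativistic equation (256)
for each of the N functions [or that equation (256) be satisfied also
when $\psi$ is considered as a matrix]. For this purpose, let us apply
to (259) the following differential operator, which is the only one
for which the terms in $\partial/\partial t$ disappear:

$$\frac{1}{c}\frac{\partial}{\partial t} - \sum_j\alpha^j\frac{\partial}{\partial x_j} - \frac{2\pi i}{h}\beta.$$

Taking into account the fact that the matrices $\alpha^k$, $\beta$ commute
with the differentiation symbols, but are not in general to be
considered as commuting with each other, we obtain

$$\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} - \sum_{j,k}\alpha^j\alpha^k\frac{\partial^2\psi}{\partial x_j\,\partial x_k} - \frac{2\pi i}{h}\sum_k\left(\alpha^k\beta + \beta\alpha^k\right)\frac{\partial\psi}{\partial x_k} + \frac{4\pi^2}{h^2}\beta^2\psi = 0.$$

In order that this equation be identical with (256), we must have

$$\begin{aligned}
\alpha^j\alpha^k + \alpha^k\alpha^j &= 0 \quad (\text{for } j \neq k), \\
(\alpha^k)^2 &= \{1\}, \qquad \beta^2 = m_0^2c^2\{1\}, \\
\beta\alpha^k + \alpha^k\beta &= 0,
\end{aligned}$$

where {1} stands for the unit matrix with N rows and N columns.
Introducing, instead of $\beta$, the matrix $\alpha^4$ defined by

$$\beta = m_0c\,\alpha^4, \tag{265}$$

we may condense the preceding formulas as follows:

$$\begin{aligned}
\alpha^\lambda\alpha^\mu + \alpha^\mu\alpha^\lambda &= 0 \quad (\text{for } \lambda \neq \mu), \\
(\alpha^\mu)^2 &= \{1\},
\end{aligned} \tag{266}$$

where the indices $\lambda$ and $\mu$, according to the convention already
adopted, take the values 1, 2, . . . N. We may also write,
gathering (266) into a single formula,

$$\alpha^\lambda\alpha^\mu + \alpha^\mu\alpha^\lambda = 2\delta_{\lambda\mu}\{1\}. \tag{266'}$$

The problem is therefore to find four Hermitian matrices which
satisfy these conditions. It may be shown that it is not possible
to find such matrices with rank less than 4. However, it is possible
to find them (in an infinite number of ways) with $N \geq 4$.^13 Since
we want to limit the complication of the theory as much as possible,
we shall assume N = 4; that is, we shall represent the state of an
electron by four functions $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$ of x, y, z, t, or else by the
matrix

$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix}.$$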

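As far as the four matrices $\alpha$ are concerned, we may take the
following, which, as may be verified, are Hermitian and satisfy
(266'):

$$\alpha^1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \qquad
\alpha^2 = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix},$$

$$\alpha^3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \qquad
\alpha^4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{267}$$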
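
The verification mentioned here can be carried out mechanically. The sketch below uses plain Python, with the matrices (267) stored as nested lists; the helper names `mul`, `adjoint`, `anticommutator` and `two_delta` are introduced for illustration only. It checks that each matrix is Hermitian and that all sixteen pairs satisfy (266').

```python
# The four matrices (267), written out row by row.
A1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
A2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
A3 = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]
A4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ALPHAS = [A1, A2, A3, A4]
N = 4


def mul(a, b):
    """Product of two N x N matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def adjoint(a):
    """Interchange rows and columns and take complex conjugates."""
    return [[complex(a[j][i]).conjugate() for j in range(N)] for i in range(N)]


def anticommutator(a, b):
    """The matrix ab + ba."""
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(N)] for i in range(N)]


def two_delta(lam, mu):
    """Right-hand side of (266'): 2 * delta(lam, mu) times the unit matrix."""
    d = 2 if lam == mu else 0
    return [[d if i == j else 0 for j in range(N)] for i in range(N)]


hermitian_ok = all(adjoint(a) == a for a in ALPHAS)
clifford_ok = all(
    anticommutator(ALPHAS[lam], ALPHAS[mu]) == two_delta(lam, mu)
    for lam in range(N) for mu in range(N))
print(hermitian_ok, clifford_ok)
```

All entries are 0, ±1 or ±i, so the comparisons are exact.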


^13 And precisely for all values of N that are multiples of 4. Such solutions,
however, may be reduced to the solution for N = 4.

The fundamental equation (259'), which, when (265) is introduced
for $\beta$, becomes

$$\left(\sum_k\alpha^k p_k - p_4 + m_0c\,\alpha^4\right)\psi = 0, \tag{268}$$

is equivalent, when we take expressions (267) for the $\alpha$, to the
following four equations (Dirac equations for the free electron):

$$\begin{aligned}
(-p_4 + m_0c)\psi_1 + (p_1 - ip_2)\psi_4 + p_3\psi_3 &= 0, \\
(-p_4 + m_0c)\psi_2 + (p_1 + ip_2)\psi_3 - p_3\psi_4 &= 0, \\
(-p_4 - m_0c)\psi_3 + (p_1 - ip_2)\psi_2 + p_3\psi_1 &= 0, \\
(-p_4 - m_0c)\psi_4 + (p_1 + ip_2)\psi_1 - p_3\psi_2 &= 0.
\end{aligned} \tag{269}$$

These may also be written, with the explicit expressions for the
operators,

$$\left(\frac{1}{c}\frac{\partial}{\partial t} + \frac{2\pi i\,m_0c}{h}\right)\psi_1 + \left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)\psi_4 + \frac{\partial\psi_3}{\partial z} = 0,$$

and so on.
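
For a plane wave the operators $p_k$ act as ordinary numbers, and (268) then reads $p_4\psi = M\psi$ with $M = \sum_k\alpha^k p_k + m_0c\,\alpha^4$. Applying it twice should give $p_4^2 = p_1^2 + p_2^2 + p_3^2 + m_0^2c^2$, the relation contained in (256). The following plain-Python sketch checks this for one set of sample numbers; the function name `dirac_symbol` and the values chosen are assumptions made for illustration.

```python
# The four matrices (267), written out row by row.
A1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
A2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
A3 = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]
A4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
N = 4


def mul(a, b):
    """Product of two N x N matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def dirac_symbol(p1, p2, p3, m0c):
    """Matrix alpha^1 p1 + alpha^2 p2 + alpha^3 p3 + m0c alpha^4,
    with the momenta treated as numbers (plane-wave case)."""
    return [[p1 * A1[i][j] + p2 * A2[i][j] + p3 * A3[i][j] + m0c * A4[i][j]
             for j in range(N)] for i in range(N)]


# Sample integer values keep the arithmetic exact: 1 + 4 + 9 + 16 = 30.
p1, p2, p3, m0c = 1, 2, 3, 4
M = dirac_symbol(p1, p2, p3, m0c)
M_squared = mul(M, M)
expected = p1**2 + p2**2 + p3**2 + m0c**2
is_multiple_of_unit = all(
    M_squared[i][j] == (expected if i == j else 0)
    for i in range(N) for j in range(N))
print(is_multiple_of_unit)
```

The cross terms cancel only because the four matrices anticommute in pairs, which is the content of (266').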

The conditions (266') may also be satisfied by an infinity of
other groups of four Hermitian matrices; there are thus as many
different forms of the Dirac equations as there are different groups
of four functions $\psi$. First of all, we see immediately that (266')
is also satisfied by the matrices $\alpha'^\mu$ which are obtained from the
$\alpha^\mu$ defined by relations (267), by the transformation
$\alpha'^\mu = S\alpha^\mu S^{-1}$, where S is an arbitrary transformation matrix
(provided that it is "unitary"). Then (268) becomes

$$\left(-p_4 + m_0c\,S\alpha^4S^{-1} + \sum_k S\alpha^kS^{-1}p_k\right)\psi' = 0,$$

where the new quadruplet of functions $\psi$ is indicated by $\psi'$.
Evidently this equation is satisfied upon taking $\psi' = S\psi$. Note that
if we apply (262) and (264) and start from $\psi'$ we obtain the same
values for $\rho$ and j as when starting from $\psi$. Hence the solution
under consideration is not physically different from the previous
one. It may be shown that no other quadruplets of matrices
(with N = 4) satisfy the desired conditions.
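
The invariance of $\rho$ and j under such a transformation can be illustrated with one concrete choice of S. In the plain-Python sketch below, the diagonal unitary matrix `S`, the sample column `psi` and all helper names are assumptions introduced for the example. It forms $\alpha'^\mu = S\alpha^\mu S^{-1}$, checks (266') for the new matrices, and compares $\tilde\psi\psi$ and $\tilde\psi\alpha^k\psi$ before and after the transformation.

```python
# The four matrices (267), written out row by row.
A1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
A2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
A3 = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]
A4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ALPHAS = [A1, A2, A3, A4]
N = 4
ONE = [[1 if i == j else 0 for j in range(N)] for i in range(N)]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(N)] for i in range(N)]


def apply(a, v):
    """Matrix a acting on the column v."""
    return [sum(a[i][k] * v[k] for k in range(N)) for i in range(N)]


def bilinear(v, a, w):
    """The number (row of conjugates of v) * a * (column w)."""
    return sum(complex(v[i]).conjugate() * a[i][j] * w[j]
               for i in range(N) for j in range(N))


# A sample unitary matrix (an assumption for this example); its inverse is its adjoint.
S = [[1, 0, 0, 0], [0, 1j, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1j]]
S_INV = adjoint(S)
unitary_ok = mul(S, S_INV) == ONE

NEW = [mul(mul(S, a), S_INV) for a in ALPHAS]          # alpha' = S alpha S^-1
clifford_ok = all(
    [[x + y for x, y in zip(r1, r2)]
     for r1, r2 in zip(mul(NEW[l], NEW[m]), mul(NEW[m], NEW[l]))]
    == [[(2 if l == m else 0) * ONE[i][j] for j in range(N)] for i in range(N)]
    for l in range(N) for m in range(N))

psi = [1, 2j, 3 - 1j, -2]                               # arbitrary sample column
psi_new = apply(S, psi)                                 # psi' = S psi
density_same = bilinear(psi, ONE, psi) == bilinear(psi_new, ONE, psi_new)
currents_same = all(
    bilinear(psi, a, psi) == bilinear(psi_new, b, psi_new)
    for a, b in zip(ALPHAS[:3], NEW[:3]))
print(unitary_ok, clifford_ok, density_same, currents_same)
```

The constant factors $-e$ and $-ce$ of (262) and (264) are omitted, since they do not affect the comparison.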

50. The electron in an electromagnetic field. Now we have to 
extend the Dirac equations to the case of an electron situated in an 
electric and magnetic field, derivable from a scalar (electrostatic) 
potential V and from a vector potential A. The extension is 
analogous to that of nonrelativistic wave mechanics, in which the 
equation for a particle in a magnetic field [see formula (142'), §31] 
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differs from that in the absence of a field in having, instead 

of the operators ”■ ^he operators 

2ti dqr 27 ^^ dt ^ 27 ^^ dqr c 

h d 

“ 7r~’ Recalling that now the charge of the electron is 

Zti ot 

indicated by — e and that U — —eF, we are led to replace, in (268), 
the operators pjt and p 4 , respectively, by 


crj ^ ^ A 


^4 = 


^ i. ^ I ^ Y 
Zwi c dt c * 


(270) 


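This substitution is suggested not only by the said analogy, but
also by the necessity of preserving the relativistic invariance of the
equation (in fact, in a Lorentz transformation $A_k$ and $V$ transform
in the same way as the operators $\partial/\partial x_k$, $\partial/\partial t$) as well as its "gauge
invariance," which will be explained in §57. The Dirac equation
then becomes

\[ \Bigl[ -\mathfrak{P}_4 + \sum_k \alpha^k\mathfrak{P}_k + m_0c\,\alpha^4 \Bigr]\psi = 0, \tag{271} \]

which, assuming expressions (267) for the $\alpha$, may be translated
into the four equations

\[ \begin{aligned}
(-\mathfrak{P}_4 + m_0c)\psi_1 + (\mathfrak{P}_1 - i\mathfrak{P}_2)\psi_4 + \mathfrak{P}_3\psi_3 &= 0,\\
(-\mathfrak{P}_4 + m_0c)\psi_2 + (\mathfrak{P}_1 + i\mathfrak{P}_2)\psi_3 - \mathfrak{P}_3\psi_4 &= 0,\\
(-\mathfrak{P}_4 - m_0c)\psi_3 + (\mathfrak{P}_1 - i\mathfrak{P}_2)\psi_2 + \mathfrak{P}_3\psi_1 &= 0,\\
(-\mathfrak{P}_4 - m_0c)\psi_4 + (\mathfrak{P}_1 + i\mathfrak{P}_2)\psi_1 - \mathfrak{P}_3\psi_2 &= 0.
\end{aligned} \tag{272} \]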

Of course, the justification of this postulate lies in the
consequences which are deduced from it; in particular, in the fact that
the results of these equations coincide, in first approximation, with
the results of the Pauli theory and, if the spin is neglected, with the
results of the Schrödinger theory; they also strictly satisfy the
relativity principle (as distinct from the other mentioned theories).
All this will be seen in the following sections. It is now to be noted
that if equation (271) is solved for the time derivative contained
in $\mathfrak{P}_4$, it may be written in the usual form:

\[ -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = \mathcal{H}\psi, \tag{273} \]

if we put

\[ \mathcal{H} = c\sum_k \alpha^k\mathfrak{P}_k - eV + m_0c^2\alpha^4. \tag{274} \]

This operator may thus be considered to be the Hamiltonian
operator of the Dirac theory. It is apparent that from (273) we
should obtain formula (118) for the derivative of an observable by
the same procedure as used in §28. This relation may therefore
be taken to hold also in the Dirac theory, provided that we
understand by $\mathcal{H}$ the operator (274).

Finally, we note that for a stationary state of energy $W$ we
have

\[ \psi_\lambda = u_\lambda(x, y, z)\, e^{-\frac{2\pi i}{h}Wt}, \]

and the matrix composed of the four $u$ satisfies (just as does $\psi$ in
this case) the equation obtained from (271) when replacing $\mathfrak{P}_4$
by $(1/c)(W + eV)$.

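51. The Pauli theory as a first approximation. We are now
going to show how the Dirac equations reduce to the equations of
the Pauli theory when terms in $1/c^2$ are neglected. For this purpose,
it is convenient to introduce, instead of the four $\psi$, two pairs of
functions $\varphi_1$, $\varphi_2$, and $\chi_1$, $\chi_2$, related to the $\psi$ by the following expressions:

\[ \psi_1 = \varphi_1 e^{-\frac{2\pi i}{h}m_0c^2t}, \quad
   \psi_2 = \varphi_2 e^{-\frac{2\pi i}{h}m_0c^2t}, \quad
   \psi_3 = \chi_1 e^{-\frac{2\pi i}{h}m_0c^2t}, \quad
   \psi_4 = \chi_2 e^{-\frac{2\pi i}{h}m_0c^2t}. \tag{275} \]

Applying to these formulas the operators $\mathfrak{P}$ defined in §50, we
immediately find

\[ \mathfrak{P}_k \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix}
   = e^{-\frac{2\pi i}{h}m_0c^2t}\,\mathfrak{P}_k \begin{pmatrix}\varphi_1\\ \varphi_2\\ \chi_1\\ \chi_2\end{pmatrix},
   \qquad
   \mathfrak{P}_4 \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix}
   = e^{-\frac{2\pi i}{h}m_0c^2t}\,(\mathfrak{P}_4 + m_0c) \begin{pmatrix}\varphi_1\\ \varphi_2\\ \chi_1\\ \chi_2\end{pmatrix}. \]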

Taking the last of these into account, we see that in the first
two equations of (272) the term $m_0c$ disappears from the first
parenthesis, while the same term is doubled in the other two. In
fact, the equations become

\[ \begin{aligned}
-\mathfrak{P}_4\varphi_1 + (\mathfrak{P}_1 - i\mathfrak{P}_2)\chi_2 + \mathfrak{P}_3\chi_1 &= 0,\\
-\mathfrak{P}_4\varphi_2 + (\mathfrak{P}_1 + i\mathfrak{P}_2)\chi_1 - \mathfrak{P}_3\chi_2 &= 0,\\
(-\mathfrak{P}_4 - 2m_0c)\chi_1 + (\mathfrak{P}_1 - i\mathfrak{P}_2)\varphi_2 + \mathfrak{P}_3\varphi_1 &= 0,\\
(-\mathfrak{P}_4 - 2m_0c)\chi_2 + (\mathfrak{P}_1 + i\mathfrak{P}_2)\varphi_1 - \mathfrak{P}_3\varphi_2 &= 0.
\end{aligned} \tag{276} \]

By $W$ we indicate the total energy; that is, we include the rest energy,
required by relativity theory, equal to $m_0c^2$. Hence between $W$ and the $E$
used thus far the relation $W = E + m_0c^2$ holds.


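Let us introduce the two-rowed matrices

\[ \varphi = \begin{pmatrix}\varphi_1\\ \varphi_2\end{pmatrix} \quad\text{and}\quad \chi = \begin{pmatrix}\chi_1\\ \chi_2\end{pmatrix}, \]

and, in addition, the three matrices (with two rows and two columns)
$\sigma_x$, $\sigma_y$, $\sigma_z$, defined in §45, which for convenience we shall now
designate by $\sigma_1$, $\sigma_2$, $\sigma_3$. Then the four preceding equations may be
assembled into the formulas

\[ \mathfrak{P}_4\varphi = \sum_k \mathfrak{P}_k\sigma_k\chi, \tag{277} \]

\[ (\mathfrak{P}_4 + 2m_0c)\chi = \sum_k \mathfrak{P}_k\sigma_k\varphi. \tag{278} \]

In ordinary cases (that is, those corresponding, in the classical
model, to particles having velocities small compared with $c$, so
that nonrelativistic mechanics may be used), $\mathfrak{P}_4\chi$ turns out to be
negligible compared with $2m_0c\chi$, so that we obtain from (278) the
following approximate expression for $\chi$:

\[ \chi = \frac{1}{2m_0c}\sum_k \mathfrak{P}_k\sigma_k\varphi, \tag{278'} \]

from which it may be seen that the $\chi$ are ordinarily small compared
with the $\varphi$ (or $\psi_3$ and $\psi_4$ are small compared with $\psi_1$ and $\psi_2$).
Substituting this expression of $\chi$ into (277), we have

\[ \mathfrak{P}_4\varphi = \frac{1}{2m_0c}\sum_{k,l} \mathfrak{P}_k\mathfrak{P}_l\,\sigma_k\sigma_l\,\varphi. \tag{277'} \]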

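In this double summation, the six terms for which $k \neq l$ may be
united two by two in the following manner. For instance, let us
consider the two terms $(\mathfrak{P}_2\mathfrak{P}_3\sigma_2\sigma_3 + \mathfrak{P}_3\mathfrak{P}_2\sigma_3\sigma_2)$. By virtue of (236),
$\sigma_2\sigma_3 = -\sigma_3\sigma_2 = i\sigma_1$, so that these two terms combine into
$i\sigma_1(\mathfrak{P}_2\mathfrak{P}_3 - \mathfrak{P}_3\mathfrak{P}_2)$. From (270), $\mathfrak{P}_2\mathfrak{P}_3 - \mathfrak{P}_3\mathfrak{P}_2 = \frac{h}{2\pi i}\frac{e}{c}H_1$,
where $\mathbf{H} = \operatorname{curl}\mathbf{A}$ is the magnetic field; the two terms therefore give
$\frac{eh}{2\pi c}H_1\sigma_1$, and the other two pairs are treated in the same way.
As for the three terms in which $k = l$, they give, if we keep
(234) in mind,

\[ \sum_k \mathfrak{P}_k^2. \]

Thus finally (277') becomes

\[ \mathfrak{P}_4\varphi = \frac{1}{2m_0c}\Bigl[\sum_k \mathfrak{P}_k^2 + \frac{eh}{2\pi c}\sum_k \sigma_kH_k\Bigr]\varphi, \]

or also, when we write $\mathfrak{P}_4$ explicitly and recall the expression of the
Bohr magneton $\mu_0 = \frac{eh}{4\pi m_0c}$,

\[ -\frac{h}{2\pi i}\frac{\partial\varphi}{\partial t} = \Bigl[\frac{1}{2m_0}\sum_k \mathfrak{P}_k^2 - eV + \mu_0\sum_k \sigma_kH_k\Bigr]\varphi. \]

The square-bracket expression on the right side of this equation
may be identified with the operator $\mathcal{H}$ of (244), and hence this
equation is identical with that in the Pauli theory.

From this we see to what extent the Dirac equations imply, at
least in first approximation, the existence of a magnetic moment
equal to $\mu_0$ directed in the sense opposite to the spin. This
consequence will be seen in a different manner in the following section.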
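The role of the Pauli-matrix relations in the double summation of (277') can be illustrated numerically. The sketch below is an addition, not part of the original text. It replaces the operators $\mathfrak{P}_k$ by ordinary commuting numbers `p[k]` (an assumption that corresponds to the absence of a magnetic field), in which case the six mixed terms cancel in pairs and only $\sum_k p_k^2$ survives. It also checks the product rule $\sigma_2\sigma_3 = i\sigma_1$ that is assumed here to be the content of (236). All names are illustrative.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul2(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def sigma_double_sum(p):
    """Return sum over k, l of p_k p_l sigma_k sigma_l for ordinary numbers p_k."""
    total = [[0, 0], [0, 0]]
    for k in range(3):
        for l in range(3):
            prod = matmul2(SIGMA[k], SIGMA[l])
            total = [[total[i][j] + p[k] * p[l] * prod[i][j] for j in range(2)]
                     for i in range(2)]
    return total

p = (0.3, -0.4, 1.2)                  # arbitrary sample momentum components
result = sigma_double_sum(p)
p_squared = sum(x * x for x in p)
product_23 = matmul2(SIGMA[1], SIGMA[2])   # sigma_2 sigma_3

print(result, p_squared)
```

When the $\mathfrak{P}_k$ do not commute, the same pairing leaves the commutators $\mathfrak{P}_k\mathfrak{P}_l - \mathfrak{P}_l\mathfrak{P}_k$, which is where the magnetic term of the Pauli equation originates.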

52. Magnetic moment of the electron. Let us now show that
the Dirac electron behaves (in first approximation) as if it had a
magnetic moment $\mu_0$, not only in regard to the energy levels, as
seen in the last section, but also inasmuch as it generates an average
magnetic field equal to the field produced by a magnetic dipole
of moment $\mu_0$. For simplicity we shall confine ourselves here to
the case of an electron at rest, possibly located in an electric field
(but not in a magnetic field of external origin) having its spin
parallel to the z-axis, so that in the Pauli approximation $\psi_2 = \varphi_2 = 0$.

In the sense explained in §27.

First let us calculate the components of the average current
density $\mathbf{j}$ (from which we must then obtain the average magnetic
field generated by the electron) by means of equation (264),
introducing the $\varphi$ and $\chi$ defined in the previous section by (275). Using
(267), we get

\[ \begin{aligned}
j_1 &= -ce(\varphi_1^*\chi_2 + \chi_2^*\varphi_1),\\
j_2 &= -ice(-\varphi_1^*\chi_2 + \chi_2^*\varphi_1),\\
j_3 &= -ce(\varphi_1^*\chi_1 + \chi_1^*\varphi_1).
\end{aligned} \]

Let us now introduce the approximate expression (278') for
the $\chi$. In the actual case, if we make use of (241) and (241'), this
becomes

\[ \chi_1 = \frac{h}{4\pi i\,m_0c}\frac{\partial\varphi_1}{\partial z}, \qquad
   \chi_2 = \frac{h}{4\pi i\,m_0c}\Bigl(\frac{\partial\varphi_1}{\partial x} + i\frac{\partial\varphi_1}{\partial y}\Bigr). \]

In addition, we recall that $\varphi_1$ satisfies the ordinary Schrödinger
equation (see page 154) and that, for a stationary state of energy $E$,
the Schrödinger equation always has a solution of the form
$\varphi_1 = u\,e^{-\frac{2\pi i}{h}Et}$, with $u$ real (see page 162). We then obtain

\[ j_1 = -c\mu_0\frac{\partial(u^2)}{\partial y}, \qquad j_2 = c\mu_0\frac{\partial(u^2)}{\partial x}, \qquad j_3 = 0. \]

Upon introduction of the vector $\mathbf{I}$ with components

\[ I_1 = 0, \qquad I_2 = 0, \qquad I_3 = -\mu_0u^2, \tag{280} \]

these formulas may be assembled into the vectorial expression

\[ \mathbf{j} = c\,\operatorname{curl}\mathbf{I}. \tag{281} \]

For a more general treatment, see, for instance, No. 26 of the Bibliography,
Chapters XIII and XIV. With the more rigorous theory we find [see G. Breit,
Nature 122, 649 (1928)] that an electron bound to a nucleus of charge $Ze$ has,
in the ground state, a magnetic moment $\frac{1}{3}\bigl(1 + 2\sqrt{1 - \alpha^2Z^2}\bigr)\mu_0$, where $\alpha$ is
the fine-structure constant.

Hence the current density does not turn out to be zero, as it
would have been in the Schrödinger theory for a $\psi$ of the form
considered here. However, the total current across any infinite plane
vanishes, as may immediately be verified. This statement may be
interpreted in terms of a model by saying that the electric charge
density "cloud," which is equivalent to the electron on the average,
does not displace itself as a whole but contains internal electric
currents, whose magnetic effects we shall now investigate.

The vector potential, from which the magnetic field is derived,
is obtained from the current density $\mathbf{j}$ by the well-known formula
from electromagnetic theory

\[ \mathbf{A} = \frac{1}{c}\int_S \frac{\mathbf{j}}{r}\,dS. \tag{282} \]

On the other hand, it may be shown that a magnetized body
whose intensity of magnetization is $\mathbf{I}$ gives rise to a magnetic field
whose vector potential is given by

\[ \mathbf{A} = \int_S \frac{\operatorname{curl}\mathbf{I}}{r}\,dS. \]

By comparing these two expressions and taking (281) into
account, we may conclude that the magnetic effect of the currents
in question is the same as would be produced by an intensity of
magnetization of space represented by the vector $\mathbf{I}$, defined by
(280). In other words, the "cloud" of electric charge density
behaves as if it were magnetized, in the direction of the negative
z-axis (that is, in the sense opposite to the angular momentum)
with an intensity $\mu_0u^2$. The total magnetic moment, which is
obtained by integrating this intensity over all space, proves to be $\mu_0$,
by virtue of the normalization of $u$.

If we had supposed instead that the spin was antiparallel to the
z-axis, that is, $\varphi_1 = 0$ and $\varphi_2 = u\,e^{-\frac{2\pi i}{h}Et}$, we should have obtained
an analogous result, but the magnetic moment would have been
directed along the positive z-axis, that is, opposite to the spin in
that case also.

See, for instance, No. 26 of the Bibliography, page 173.

In the general case we find that the equivalent magnetization
is given to the same approximation by

\[ \begin{aligned}
I_1 &= -\mu_0(\varphi_1^*\varphi_2 + \varphi_2^*\varphi_1),\\
I_2 &= i\mu_0(\varphi_1^*\varphi_2 - \varphi_2^*\varphi_1),\\
I_3 &= -\mu_0(\varphi_1^*\varphi_1 - \varphi_2^*\varphi_2).
\end{aligned} \tag{283} \]

We might suppose that we could detect the magnetic moment of the 
electron experimentally (as is done for atoms) by means of an experiment 
of the Stern-Gerlach type. This is not possible because, no matter what 
the experimental arrangement, the spreading of the electron beam due to 
diffraction is of the same order of magnitude as the deviation due to the 
magnetic forces we are trying to detect. In fact, Bohr has pointed out
that to the order of approximation in which wave mechanics coincides 
with point mechanics, the spin and the magnetic moment of the electron 
(which both occur in terms containing h) disappear; hence no experiment 
in which the corpuscular model may be applied to the electron permits the 
detection of these moments. 

However, it is not impossible to imagine experiments of the non- 
corpuscular type which would allow us to prove the existence of the spin. 
For example, a beam of electrons diffracted by a crystal in a direction 
nearly perpendicular to the incident beam should presumably be "polarized";
that is to say, if it is subsequently diffracted by another crystal, it
should exhibit different degrees of diffraction in the various planes passing 
through its direction of propagation. (In the vector model, this change 
might be interpreted by imagining that after the first diffraction the spins 
of the electrons are no longer oriented at random but with a preference for 
one of these planes.) This experiment, performed in different ways by 
various investigators, has given results which are still open to question. 

53. The spin in the Dirac equations. In order to show that
(271) implicitly contains the existence of spin and its property of
quantization (which also results from §51, but only to a first
approximation), let us consider the case in which the force acting
upon the electron has zero moment with respect to the z-axis, such
as for a central field. In this case we have shown (see §30) that
in the Schrödinger theory the angular momentum about the z-axis
remains constant, or that

\[ M_z = xp_y - yp_x \tag{284} \]

is a first integral, as in ordinary mechanics. We shall now show
that in the Dirac theory this is no longer so, and that instead of $M_z$,
the observable $N_z = M_z + S_z$ will be a constant, where $S_z$ is an
observable whose eigenvalues are $\pm h/4\pi$. This result signifies that
the total angular momentum with respect to the z-axis is not $M_z$
but $N_z$, and that the term $S_z$ represents an intrinsic angular
momentum of the electron, whose projection upon the z-axis is always
$\pm h/4\pi$.

For the proof, see N. F. Mott, Proc. Roy. Soc. A124, 425 (1929).

In order to realize the above-mentioned conditions, let us
suppose that $\mathbf{A} = 0$ (no magnetic field) and that $V$ is symmetric about
the z-axis; that is, if we adopt polar coordinates and if $\varphi$ is the
"longitude," $V$ is to be independent of $\varphi$. Let us now calculate
the derivative of $M_z$ with respect to $t$, using (118) of §28. We get
(ignoring the distinction between the symbol for an observable and
the symbol of its operator)

\[ \dot{M}_z = \frac{2\pi i}{h}(\mathcal{H}M_z - M_z\mathcal{H}). \tag{285} \]

It is now apparent that in the present case the Hamiltonian
reduces to [see equation (274)]

\[ \mathcal{H} = c\sum_k \alpha^kp_k - eV + m_0c^2\alpha^4, \tag{286} \]

and that $M_z$, by (284), commutes with $\alpha^1$, $\alpha^2$, $\alpha^3$, $\alpha^4$, with $p_z$, and
also with $V$ (since, as we have seen in §30, $M_z = (h/2\pi i)(\partial/\partial\varphi)$ in
polar coordinates, and $V$ is independent of $\varphi$). Hence in (285)
there remains only the contribution of the first two terms in the
summation of (286):

\[ \dot{M}_z = \frac{2\pi i}{h}\bigl[c\alpha^1(p_xM_z - M_zp_x) + c\alpha^2(p_yM_z - M_zp_y)\bigr]. \]

But from the commutation relations and from (284) we obtain

\[ p_xM_z - M_zp_x = \frac{h}{2\pi i}p_y, \qquad p_yM_z - M_zp_y = -\frac{h}{2\pi i}p_x; \]

hence

\[ \dot{M}_z = c(\alpha^1p_y - \alpha^2p_x). \tag{287} \]

As we see, this derivative is not identically zero, which means that
$M_z$ is not a first integral.

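Let us now consider the observable $S_z$ whose operator is

\[ S_z = \frac{h}{4\pi i}\alpha^1\alpha^2, \tag{288} \]

and let us calculate its derivative by the formula used above.
We have

\[ \dot{S}_z = \frac{2\pi i}{h}(\mathcal{H}S_z - S_z\mathcal{H}). \]

Substituting for $\mathcal{H}$ the expression (286), we note that $S_z$
commutes with the $p$ and with $V$, and that furthermore we have, as an
immediate consequence of (266),

\[ \begin{aligned}
\alpha^1(\alpha^1\alpha^2) - (\alpha^1\alpha^2)\alpha^1 &= 2\alpha^2,\\
\alpha^2(\alpha^1\alpha^2) - (\alpha^1\alpha^2)\alpha^2 &= -2\alpha^1,\\
\alpha^3(\alpha^1\alpha^2) - (\alpha^1\alpha^2)\alpha^3 &= 0,\\
\alpha^4(\alpha^1\alpha^2) - (\alpha^1\alpha^2)\alpha^4 &= 0.
\end{aligned} \]

Thus we find

\[ \dot{S}_z = c(\alpha^2p_x - \alpha^1p_y). \]

Comparing this expression with (287), we find that

\[ \dot{M}_z + \dot{S}_z = 0, \]

that is, that the observable $N_z = M_z + S_z$ is a first integral, as
was claimed. Similar reasoning may be applied for the x and y
components. We conclude that the projections of the spin upon
the three axes are represented by the operators

\[ S_x = \frac{h}{4\pi i}\alpha^2\alpha^3, \qquad S_y = \frac{h}{4\pi i}\alpha^3\alpha^1, \qquad S_z = \frac{h}{4\pi i}\alpha^1\alpha^2, \]

or, adopting the matrices (267) and carrying out the products, by

\[ S_x = \frac{h}{4\pi}\begin{pmatrix}0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0\end{pmatrix}, \quad
   S_y = \frac{h}{4\pi}\begin{pmatrix}0&-i&0&0\\ i&0&0&0\\ 0&0&0&-i\\ 0&0&i&0\end{pmatrix}, \quad
   S_z = \frac{h}{4\pi}\begin{pmatrix}1&0&0&0\\ 0&-1&0&0\\ 0&0&1&0\\ 0&0&0&-1\end{pmatrix}. \tag{289} \]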
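The four commutators of the $\alpha$ with the product $\alpha^1\alpha^2$, and the diagonal form of $S_z$, can be verified with a few lines of matrix arithmetic. This is a minimal sketch added for illustration. It assumes the representation of the $\alpha$ that is implied by equations (272), and it works in units of $h/4\pi$; all names are hypothetical.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def off_diagonal(s):
    """4x4 matrix with the 2x2 block s in both off-diagonal corners."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

# Assumed representation: alpha^1..alpha^3 off-diagonal, alpha^4 diagonal.
ALPHA = [off_diagonal(s) for s in SIGMA]
ALPHA.append([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]])
ZERO = [[0] * 4 for _ in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def commutator(a, b):
    """Return ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

A12 = matmul(ALPHA[0], ALPHA[1])                    # alpha^1 alpha^2
COMMUTATORS = [commutator(a, A12) for a in ALPHA]   # one entry per alpha^k
SZ_UNITS = scale(-1j, A12)                          # S_z divided by h/4pi

print([SZ_UNITS[i][i] for i in range(4)])
```

Only the first two commutators are different from zero, which is why only the terms in $p_x$ and $p_y$ of (286) contribute to the derivative of $S_z$.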


It is to be noted first of all that these matrices are Hermitian,
and hence the observables which they represent are real.
Furthermore, each of these has the eigenvalues $\pm h/4\pi$. In fact, it may
readily be verified that the three matrices written above have their
squares equal to the unit matrix, and hence that $S_x^2$, $S_y^2$, $S_z^2$ have the
single eigenvalue $(h/4\pi)^2$. It is also apparent that $S_x$, $S_y$, $S_z$ do
not commute with one another. Instead, they satisfy the same
commutation relations which were found in §30 for ordinary angular
momenta.

The particular choice adopted for the matrices makes $S_z$ diagonal. This
means that the operators are in the "$S_z$-representation." By a transformation
of the type described on page 408, we could pass to a representation in which
$S_y$ or $S_x$ would be diagonal.
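The properties just listed for the matrices (289) can be checked directly. The sketch below is an illustrative addition; it works in units of $h/4\pi$, so that the commutation relations take the form $S_xS_y - S_yS_x = 2iS_z$ (and cyclic permutations). That this is the form of the §30 relations is an assumption, since §30 is not reproduced here.

```python
# Spin matrices of (289) in units of h/4pi.
SX = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
SY = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
SZ = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def dagger(a):
    """Hermitian conjugate (transpose and complex conjugate)."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def times_2i(a):
    return [[2j * x for x in row] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(4))

hermitian = all(close(dagger(s), s) for s in (SX, SY, SZ))
unit_squares = all(close(matmul(s, s), IDENTITY) for s in (SX, SY, SZ))
angular_momentum_algebra = (
    close(commutator(SX, SY), times_2i(SZ))
    and close(commutator(SY, SZ), times_2i(SX))
    and close(commutator(SZ, SX), times_2i(SY))
)

print(hermitian, unit_squares, angular_momentum_algebra)
```

Since each square is the unit matrix, every eigenvalue is $+1$ or $-1$ in these units, and the vanishing trace shows that the two values occur equally often, that is, each is double.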

Of course, all the considerations of Chapter 11 for operators
corresponding to other observables apply to the spin operators.
The only difference is that the latter operate upon functions of a
variable with only four values, namely, the index $\lambda$ of the four
$\psi_\lambda(x, y, z, t)$. (Since they do not involve the other variables x, y,
z, we are dealing with "incomplete" operators.) These operators are
therefore simply linear transformations of groups of four numbers.
Correspondingly, each of these has only four eigenvalues and four
eigenfunctions (in fact, the four eigenvalues reduce to two double
eigenvalues).

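In order to illustrate these matters by an example, let us seek
the eigenvalues and eigenfunctions of the operator $S_z$ defined by
(288) or by the last expression of (289). We indicate by $\varphi^{\mu}$ ($\mu = 1$,
2, 3, 4) a general one of the four eigenfunctions (which will reduce
to a set of four numbers $\varphi_1^{\mu}$, $\varphi_2^{\mu}$, $\varphi_3^{\mu}$, $\varphi_4^{\mu}$), and by $s^{\mu}$ the
corresponding eigenvalue. We must then have

\[ S_z\varphi^{\mu} = s^{\mu}\varphi^{\mu}, \]

or, in explicit form,

\[ \frac{h}{4\pi}\begin{pmatrix}1&0&0&0\\ 0&-1&0&0\\ 0&0&1&0\\ 0&0&0&-1\end{pmatrix}
   \begin{pmatrix}\varphi_1^{\mu}\\ \varphi_2^{\mu}\\ \varphi_3^{\mu}\\ \varphi_4^{\mu}\end{pmatrix}
   = s^{\mu}\begin{pmatrix}\varphi_1^{\mu}\\ \varphi_2^{\mu}\\ \varphi_3^{\mu}\\ \varphi_4^{\mu}\end{pmatrix}, \]

which is equivalent to the four equations

\[ \Bigl(\frac{h}{4\pi} - s^{\mu}\Bigr)\varphi_1^{\mu} = 0, \quad
   \Bigl(\frac{h}{4\pi} + s^{\mu}\Bigr)\varphi_2^{\mu} = 0, \quad
   \Bigl(\frac{h}{4\pi} - s^{\mu}\Bigr)\varphi_3^{\mu} = 0, \quad
   \Bigl(\frac{h}{4\pi} + s^{\mu}\Bigr)\varphi_4^{\mu} = 0. \]

Since we have to exclude the solution $\varphi_1^{\mu} = \varphi_2^{\mu} = \varphi_3^{\mu} = \varphi_4^{\mu} = 0$,
we must have either $s = -h/4\pi$ or $s = +h/4\pi$. Since each of these
eigenvalues is double, as we shall see shortly, we shall count them
twice and put

\[ s^1 = s^3 = +\frac{h}{4\pi}, \qquad s^2 = s^4 = -\frac{h}{4\pi}. \]

Hence the three spin components are not compatible observables. It
is because of this fact that the properties of the spin do not correspond at all
to those of an ordinary gyroscope.

In the first case ($\mu = 1, 3$), the equations yield $\varphi_2^{\mu} = \varphi_4^{\mu} = 0$,
whereas $\varphi_1^{\mu}$ and $\varphi_3^{\mu}$ remain arbitrary (except for orthonormality) and
may be taken equal to 1 and 0, respectively (for $\mu = 1$), or to 0 and 1
(for $\mu = 3$). Similarly, in the second case ($\mu = 2, 4$), we have
$\varphi_1^{\mu} = \varphi_3^{\mu} = 0$, and $\varphi_2^{\mu}$, $\varphi_4^{\mu}$ may be taken equal to (1, 0) and (0, 1),
respectively. Thus we obtain the four eigenfunctions of the
operator $S_z$:

\[ \varphi^1 = \begin{pmatrix}1\\ 0\\ 0\\ 0\end{pmatrix}, \quad
   \varphi^2 = \begin{pmatrix}0\\ 1\\ 0\\ 0\end{pmatrix}, \quad
   \varphi^3 = \begin{pmatrix}0\\ 0\\ 1\\ 0\end{pmatrix}, \quad
   \varphi^4 = \begin{pmatrix}0\\ 0\\ 0\\ 1\end{pmatrix}. \tag{290} \]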

We shall now make use of these in order to solve the following
problem. Given the $\psi$, find the probability that an observation
of the spin with respect to the z-axis yields the result $+h/4\pi$
(or $-h/4\pi$). Let us apply the procedure of the note in §22 literally;
that is, let us develop the matrix $\psi$ in terms of the eigenfunctions $\varphi^{\mu}$
(the series reduces to four terms; let us indicate by $c_{\mu}$ the coefficients
of the development used in §22). By this method we obtain

\[ \psi = c_1\varphi^1 + c_2\varphi^2 + c_3\varphi^3 + c_4\varphi^4. \]

It may be readily verified, by use of (290), that the four
coefficients $c$ are identical with $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$, respectively. Applying
(97') of §22, we find that the probability corresponding to the
eigenvalue $s^1$ is $P_1 = \iiint |\psi_1|^2\,dx\,dy\,dz$, and similarly the probability
corresponding to $s^3$ is $P_3 = \iiint |\psi_3|^2\,dx\,dy\,dz$. Since the two
eigenvalues coincide, the total probability for the double eigenvalue
$+h/4\pi$ is

\[ P_+ = \iiint (|\psi_1|^2 + |\psi_3|^2)\,dx\,dy\,dz. \tag{291} \]

Similarly, we should find that the probability for the double
eigenvalue $-h/4\pi$ is

\[ P_- = \iiint (|\psi_2|^2 + |\psi_4|^2)\,dx\,dy\,dz. \tag{291'} \]

We may therefore say that of the four $\psi$, the two with odd index
correspond (in the vector model) to the spin oriented parallel to
the z-axis, and the two with even index to a spin antiparallel to the
z-axis. In the nonrelativistic approximation, as we have seen, $\psi_3$
and $\psi_4$ may be neglected with respect to $\psi_1$ and $\psi_2$, and the formulas
(291) and (291') reduce to those which define the two $\psi$-functions
of Pauli.
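The bookkeeping behind (291) and (291') can be sketched for the simplest possible case. The example below is an illustrative addition: it treats $\psi$ as a single set of four complex amplitudes with no dependence on position (an assumption, so that the space integrals reduce to plain sums), expands it in the eigenfunctions (290), and adds the squared moduli of the coefficients belonging to each double eigenvalue. The function and variable names are hypothetical.

```python
# Eigenfunctions (290) of S_z, listed in the order mu = 1, 2, 3, 4.
EIGENFUNCTIONS = [
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 1),
]

def expansion_coefficients(psi):
    """Coefficients c_mu of psi in the eigenfunctions: c_mu = sum_l conj(phi^mu_l) psi_l."""
    return [
        sum(complex(phi[l]).conjugate() * psi[l] for l in range(4))
        for phi in EIGENFUNCTIONS
    ]

def spin_z_probabilities(psi):
    """Return (P_plus, P_minus) for a position-independent four-component psi."""
    c = expansion_coefficients(psi)
    norm = sum(abs(x) ** 2 for x in c)
    p_plus = (abs(c[0]) ** 2 + abs(c[2]) ** 2) / norm    # mu = 1, 3: s = +h/4pi
    p_minus = (abs(c[1]) ** 2 + abs(c[3]) ** 2) / norm   # mu = 2, 4: s = -h/4pi
    return p_plus, p_minus

psi = (0.8, 0.6j, 0.0, 0.0)   # sample amplitudes, small components neglected
coefficients = expansion_coefficients(psi)
p_plus, p_minus = spin_z_probabilities(psi)

print(p_plus, p_minus)
```

For the sample amplitudes the third and fourth components vanish, which mimics the nonrelativistic limit in which only the two Pauli functions matter.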

54. Plane waves. As an example of the rigorous solution of the
Dirac equations, let us examine the special case of an electron not
subject to forces and having an exactly determined momentum $\mathbf{p}$.
Calling $W$ the total energy (including the rest energy), we shall
try to satisfy the equations (269) by a solution which is analogous
to the one of the corresponding case in the Schrödinger theory
(see §44), namely, by plane waves of frequency $W/h$ and with
propagation vector $\mathbf{p}/h$. Let us therefore put

\[ \psi_\lambda = a_\lambda\,e^{\frac{2\pi i}{h}(p_1x + p_2y + p_3z - Wt)} \qquad (\lambda = 1, 2, 3, 4). \tag{292} \]

Substituting into (269), we obtain

\[ \begin{aligned}
\Bigl(-\frac{W}{c} + m_0c\Bigr)a_1 + (p_1 - ip_2)a_4 + p_3a_3 &= 0,\\
\Bigl(-\frac{W}{c} + m_0c\Bigr)a_2 + (p_1 + ip_2)a_3 - p_3a_4 &= 0,\\
\Bigl(-\frac{W}{c} - m_0c\Bigr)a_3 + (p_1 - ip_2)a_2 + p_3a_1 &= 0,\\
\Bigl(-\frac{W}{c} - m_0c\Bigr)a_4 + (p_1 + ip_2)a_1 - p_3a_2 &= 0.
\end{aligned} \tag{293} \]

These four linear homogeneous equations in the four constants
$a_1$, $a_2$, $a_3$, $a_4$ have nonzero solutions only if

\[ \begin{vmatrix}
-\frac{W}{c} + m_0c & 0 & p_3 & p_1 - ip_2\\
0 & -\frac{W}{c} + m_0c & p_1 + ip_2 & -p_3\\
p_3 & p_1 - ip_2 & -\frac{W}{c} - m_0c & 0\\
p_1 + ip_2 & -p_3 & 0 & -\frac{W}{c} - m_0c
\end{vmatrix} = 0. \tag{294} \]

Developing the determinant, we find that it is equal to

\[ \Bigl(\frac{W^2}{c^2} - m_0^2c^2 - p^2\Bigr)^2, \]

where we have put $p^2 = p_1^2 + p_2^2 + p_3^2$. Hence the condition that
the determinant vanish is

\[ \frac{W}{c} = \pm\sqrt{m_0^2c^2 + p^2}. \tag{295} \]

This coincides with the relation given by relativistic mechanics
between the energy $W$ and the momentum $p$. If we take the plus
sign and expand the radical in a series, we may write, neglecting the
higher powers of $p$,

\[ W = m_0c^2 + \frac{p^2}{2m_0}. \tag{295'} \]

This expression for the energy reduces, for small velocities, to
the kinetic energy of ordinary mechanics, increased by the rest
energy $m_0c^2$. If we take the minus sign instead, we obtain a value
for $W$ close to $-m_0c^2$; this value has no analogue in ordinary
mechanics. We shall return to these anomalous (negative) values
of the kinetic energy in §60.

Let us now proceed to a determination of the $a$. Since, as may
be verified, equation (295) makes all third-order minors of the
determinant vanish, one may fix two of the four $a$ arbitrarily and
solve for the other two. Hence relations (293) will have two
linearly independent solutions, corresponding to the two possible
orientations of the spin. We shall take for fundamental solutions,
in the case $W > 0$,

\[ \begin{aligned}
\text{(I)}\quad & a_1 = A, & a_2 &= 0, & a_3 &= A\,\frac{cp_3}{W + m_0c^2}, & a_4 &= A\,\frac{c(p_1 + ip_2)}{W + m_0c^2},\\
\text{(II)}\quad & a_1 = 0, & a_2 &= A, & a_3 &= A\,\frac{c(p_1 - ip_2)}{W + m_0c^2}, & a_4 &= A\,\frac{-cp_3}{W + m_0c^2};
\end{aligned} \tag{296} \]

the modulus of the constant $A$ is determined from the normalization
condition (see §10 of Part II).

It is to be noted that the denominators of (296) are of the order
of $2m_0c^2$; hence, if $p \ll m_0c$, as it is ordinarily, $a_3$ and $a_4$ prove to be
small compared with $A$, as was already seen in general in §51. To
the order of approximation in which $|a_3|^2$ and $|a_4|^2$ are considered
negligible compared with $|A|^2$, solution (I) corresponds to the case
in which the spin is parallel to the z-axis, whereas solution (II)
corresponds to the case of antiparallel spin. The most general
solution, which is obtained by combining (I) and (II) linearly,
represents the cases in which the spin along the z-axis does not
have a definite value.

Proceeding now to the case of $W < 0$, we shall find it convenient
to take as fundamental solutions

\[ \begin{aligned}
\text{(I)}\quad & a_1 = B\,\frac{cp_3}{W - m_0c^2}, & a_2 &= B\,\frac{c(p_1 + ip_2)}{W - m_0c^2}, & a_3 &= B, & a_4 &= 0,\\
\text{(II)}\quad & a_1 = B\,\frac{c(p_1 - ip_2)}{W - m_0c^2}, & a_2 &= B\,\frac{-cp_3}{W - m_0c^2}, & a_3 &= 0, & a_4 &= B.
\end{aligned} \tag{297} \]

We get $|B|$ from the normalization condition.

In this case $a_1$ and $a_2$ are small compared with $B$ (supposing
that $p \ll m_0c$). If $a_1$ and $a_2$ are considered negligible, solution (I)
corresponds to the spin parallel to the z-axis; solution (II), to
antiparallel spin.
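The fundamental solutions (296) and (297) can be substituted back into the system (293) numerically. The following sketch is an illustrative addition; it uses units in which $m_0 = c = 1$ and an arbitrary sample momentum, both of which are assumptions made only to obtain concrete numbers. The function names are hypothetical.

```python
import math

def system_matrix(W, p, m0=1.0, c=1.0):
    """Coefficient matrix of the four equations (293) for energy W and momentum p."""
    p1, p2, p3 = p
    up = -W / c + m0 * c
    down = -W / c - m0 * c
    return [
        [up, 0, p3, p1 - 1j * p2],
        [0, up, p1 + 1j * p2, -p3],
        [p3, p1 - 1j * p2, down, 0],
        [p1 + 1j * p2, -p3, 0, down],
    ]

def residual(W, p, a):
    """Largest modulus among the left-hand sides of (293) for amplitudes a."""
    M = system_matrix(W, p)
    return max(abs(sum(M[i][j] * a[j] for j in range(4))) for i in range(4))

def positive_energy_solutions(p, m0=1.0, c=1.0, A=1.0):
    """Energy W > 0 from (295) and the two solutions (296)."""
    p1, p2, p3 = p
    W = c * math.sqrt(m0**2 * c**2 + p1**2 + p2**2 + p3**2)
    d = W + m0 * c**2
    sol_1 = [A, 0, A * c * p3 / d, A * c * (p1 + 1j * p2) / d]
    sol_2 = [0, A, A * c * (p1 - 1j * p2) / d, -A * c * p3 / d]
    return W, [sol_1, sol_2]

def negative_energy_solutions(p, m0=1.0, c=1.0, B=1.0):
    """Energy W < 0 from (295) and the two solutions (297)."""
    p1, p2, p3 = p
    W = -c * math.sqrt(m0**2 * c**2 + p1**2 + p2**2 + p3**2)
    d = W - m0 * c**2
    sol_1 = [B * c * p3 / d, B * c * (p1 + 1j * p2) / d, B, 0]
    sol_2 = [B * c * (p1 - 1j * p2) / d, -B * c * p3 / d, 0, B]
    return W, [sol_1, sol_2]

P = (0.3, -0.4, 0.5)                       # sample momentum in units of m0*c
W_POS, SOLUTIONS_POS = positive_energy_solutions(P)
W_NEG, SOLUTIONS_NEG = negative_energy_solutions(P)
RESIDUALS = [residual(W_POS, P, a) for a in SOLUTIONS_POS]
RESIDUALS += [residual(W_NEG, P, a) for a in SOLUTIONS_NEG]

print(W_POS, W_NEG, max(RESIDUALS))
```

All four residuals vanish to rounding error, which confirms that each of the listed amplitude sets satisfies the four equations once $W$ obeys (295).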

55. Other form of the Dirac equations. We shall now put the
Dirac equation into a different form which, since the variable $t$ is
treated symmetrically with x, y, z, lends itself especially to
considerations of a relativistic nature.

Let us multiply the Dirac equation (271) by $-i\alpha^4$ (from the
left), and let us put ($k = 1, 2, 3$):

\[ -i\alpha^4\alpha^k = \gamma^k, \qquad \alpha^4 = \gamma^4, \tag{298} \]

\[ \mathfrak{P}_k = \Pi_k, \qquad i\mathfrak{P}_4 = \Pi_4. \tag{299} \]

This arrangement permits us to collect even the term with
index 4 into a single summation, and the equation may thereby be
written in the form

\[ \Bigl(\sum_{\mu=1}^{4} \gamma^{\mu}\Pi_{\mu} - im_0c\Bigr)\psi = 0. \tag{300} \]

We recall that in this whole chapter we indicate by Greek letters the
indices taking the values 1, 2, 3, 4, and by ordinary letters those assuming
only the values 1, 2, 3.

The matrices $\gamma^{\mu}$, defined by (298), are Hermitian, like the $\alpha^{\mu}$. As
may readily be seen, they also satisfy the relations

\[ \gamma^{\lambda}\gamma^{\mu} + \gamma^{\mu}\gamma^{\lambda} = 2\delta_{\lambda\mu} \qquad (\lambda, \mu = 1, 2, 3, 4). \tag{301} \]
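Both properties of the $\gamma$ matrices can be confirmed by direct multiplication. The sketch below is an illustrative addition. It assumes the reading $\gamma^k = -i\alpha^4\alpha^k$, $\gamma^4 = \alpha^4$ of the definition (298) and the representation of the $\alpha$ implied by (272); all names are hypothetical.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ALPHA4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def off_diagonal(s):
    """4x4 matrix with the 2x2 block s in both off-diagonal corners (an alpha^k)."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def dagger(a):
    """Hermitian conjugate (transpose and complex conjugate)."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

# Assumed reading of (298): gamma^k = -i alpha^4 alpha^k, gamma^4 = alpha^4.
GAMMA = [scale(-1j, matmul(ALPHA4, off_diagonal(s))) for s in SIGMA] + [ALPHA4]

def satisfies_301(gammas):
    """Check gamma^l gamma^m + gamma^m gamma^l = 2 delta_lm for all pairs."""
    for l in range(4):
        for m in range(4):
            lm, ml = matmul(gammas[l], gammas[m]), matmul(gammas[m], gammas[l])
            anti = [[lm[i][j] + ml[i][j] for j in range(4)] for i in range(4)]
            if not close(anti, scale(2 if l == m else 0, IDENTITY)):
                return False
    return True

all_hermitian = all(close(dagger(g), g) for g in GAMMA)
print(all_hermitian, satisfies_301(GAMMA))
```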

The operators $\Pi_{\mu}$ may be written in a more symmetric form by
introducing, instead of $t$, the variable

\[ x_4 = ict, \]

and considering $A_1$, $A_2$, $A_3$, $iV$ as the four components of a
"four-vector" $\Phi$ in the space of the variables $x_1$, $x_2$, $x_3$, $x_4$ (Minkowski
space). In fact, it is known from the theory of relativity that
under a Lorentz transformation these four quantities transform just
like the components of an invariant four-vector. Therefore we put

\[ A_k = \Phi_k, \qquad iV = \Phi_4, \tag{302} \]

and then (299) may be assembled into the single expression

\[ \Pi_{\mu} = \frac{h}{2\pi i}\frac{\partial}{\partial x_{\mu}} + \frac{e}{c}\Phi_{\mu}. \tag{299'} \]


Let us now deal with the expressions for the (average) electric
charge density $\rho$ and the (average) current density $\mathbf{j}$. It is
convenient to introduce

\[ \psi^+ = i\tilde{\psi}\gamma^4, \tag{303} \]

from which, by multiplying by $-i\gamma^4$ from the right,

\[ \tilde{\psi} = -i\psi^+\gamma^4. \tag{303'} \]

The expressions (262) and (264) then become, respectively,

\[ \rho = ie\,\psi^+\gamma^4\psi, \tag{304} \]

\[ j_k = -ce\,\psi^+\gamma^k\psi, \tag{305} \]

which show that the four quantities

\[ J_k = j_k, \qquad J_4 = ic\rho \tag{306} \]

(which, we see, also constitute the components of an invariant
four-vector, namely, the "four-current") are expressed in a uniform
manner by means of the $\psi$ and $\psi^+$, since (304) and (305) may be
combined into the equation

\[ J_{\mu} = -ce\,\psi^+\gamma^{\mu}\psi. \tag{307} \]

Note that the equation of continuity (263), in the notation (306),
takes on the form

\[ \sum_{\mu} \frac{\partial J_{\mu}}{\partial x_{\mu}} = 0, \tag{308} \]

which is relativistically invariant.

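It may be useful to write down the equation which $\psi^+$ satisfies.
From the Dirac equation (271) we obtain, by taking its complex
conjugate (and noting that the conjugate of $\alpha\psi$ is $\tilde{\psi}\alpha$ if $\alpha$ is a
Hermitian matrix),

\[ \sum_k \mathfrak{P}_k^*\tilde{\psi}\alpha^k - \mathfrak{P}_4^*\tilde{\psi} + m_0c\,\tilde{\psi}\alpha^4 = 0. \tag{309} \]

Inserting (303') and taking (298) into account, we have

\[ \sum_k \mathfrak{P}_k^*\psi^+\gamma^k + i\mathfrak{P}_4^*\psi^+\gamma^4 - im_0c\,\psi^+ = 0. \tag{309'} \]

In order to collect the first two terms under a single summation
sign, it is convenient to define the operators

\[ \Pi_k^+ = \Pi_k^* = \mathfrak{P}_k^*, \qquad \Pi_4^+ = -\Pi_4^* = i\mathfrak{P}_4^*, \tag{310} \]

that is,

\[ \Pi_{\mu}^+ = -\frac{h}{2\pi i}\frac{\partial}{\partial x_{\mu}} + \frac{e}{c}\Phi_{\mu}. \tag{310'} \]

Thus equation (309') becomes

\[ \sum_{\mu} \Pi_{\mu}^+\psi^+\gamma^{\mu} - im_0c\,\psi^+ = 0. \tag{311} \]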


56. Relativistic invariance of the Dirac equations. The
relativistic invariance of the Dirac equations is to be understood in the
following sense. We consider a second system of reference in
uniform and rectilinear translation with respect to the first; that is,
we pass, by a Lorentz transformation, from the variables $x_1$, $x_2$, $x_3$,
$x_4 = ict$ to $x_1'$, $x_2'$, $x_3'$, $x_4' = ict'$. We shall show that in the new
reference system, the Dirac equation (300) holds in the same form
as in the old frame; that is, it may be written

\[ \Bigl(\sum_{\mu} \gamma^{\mu}\Pi_{\mu}' - im_0c\Bigr)\psi' = 0. \tag{312} \]

This does not mean that the $\psi'$ obtained from this equation is equal
to $\psi$, but that the quantities $\rho'$, $\mathbf{j}'$ (electric charge and current
densities in the second system of reference) obtained from $\psi'$ by
formulas analogous to (262) and (264) are equal to those obtained
from $\rho$ and $\mathbf{j}$ when the latter are transformed according to the laws
by which the electric charge and current densities transform in
ordinary electromagnetic theory. That is to say, they transform
in such a way that the four quantities $J_k = j_k$, $J_4 = ic\rho$ constitute
the components of an invariant four-vector in Minkowski space.

In order to prove what we have stated, let us consider the most
general Lorentz transformation, or the most general orthogonal
transformation in Minkowski space, expressed by the relations

\[ x_{\mu}' = \sum_{\nu} a_{\mu\nu}x_{\nu}, \tag{313} \]

where the coefficients $a_{\mu\nu}$ are connected by the relations

\[ \sum_{\mu} a_{\mu\rho}a_{\mu\sigma} = \delta_{\rho\sigma}. \tag{314} \]

It is now to be remembered that these coefficients are real,
except for those which contain the index 4 only once; the latter
are pure imaginary (such as $x_4$ and $\Phi_4$). The operators $\Pi_{\mu}$ transform
like the components of a four-vector [as may readily be recognized
from (299')], that is, according to the relations

\[ \Pi_{\mu}' = \sum_{\nu} a_{\mu\nu}\Pi_{\nu}. \tag{315} \]

Hence (312) may be written

\[ \Bigl(\sum_{\mu,\nu} a_{\mu\nu}\gamma^{\mu}\Pi_{\nu} - im_0c\Bigr)\psi' = 0. \tag{316} \]

Let us now attempt to satisfy this equation by assuming, pending
a posteriori verification, that the $\psi$ transforms linearly, analogous
to the behavior of the components of the electric and magnetic fields
under a Lorentz transformation. Therefore let us put

\[ \psi_{\lambda}' = \sum_{\rho} S_{\lambda\rho}\psi_{\rho}, \tag{317} \]

leaving the coefficients $S_{\lambda\rho}$ undetermined for the present. In short,
we shall write, denoting by $S$ the matrix whose elements are $S_{\lambda\rho}$,

\[ \psi' = S\psi. \tag{318} \]

Substituting into (316) and multiplying by $S^{-1}$ from the left,
we obtain (noting that $S$ commutes with the $\Pi$ but not with the $\gamma$)

\[ \Bigl(\sum_{\mu,\nu} a_{\mu\nu}S^{-1}\gamma^{\mu}S\,\Pi_{\nu} - im_0c\Bigr)\psi = 0. \]

This equation will be satisfied if the matrix $S$ is such that

\[ \sum_{\mu} a_{\mu\nu}S^{-1}\gamma^{\mu}S = \gamma^{\nu}, \tag{319} \]

since in that case the equation is identical with (300), which by
hypothesis is satisfied by $\psi$. Hence we have to prove that for any
Lorentz transformation there exists a matrix $S$ satisfying (319).
In order to show this, let us first consider an "infinitesimal" Lorentz
transformation; that is, let us put

\[ a_{\mu\nu} = \delta_{\mu\nu} + \epsilon_{\mu\nu}, \tag{320} \]

where the $\epsilon_{\mu\nu}$ are infinitesimals of the first order. Equation (314)
becomes the condition for antisymmetry

\[ \epsilon_{\rho\sigma} + \epsilon_{\sigma\rho} = 0. \tag{321} \]

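It is natural in this case to seek for $S$ a form which is infinitely close
to the unit matrix, that is, to put $S_{\rho\sigma} = \delta_{\rho\sigma} + T_{\rho\sigma}$, where the $T_{\rho\sigma}$ are
infinitesimals, or

\[ S = \{1\} + T. \tag{322} \]

Then, neglecting infinitesimals of higher order, we have

\[ S^{-1} = \{1\} - T, \tag{323} \]

and (319) assumes the form

\[ \sum_{\mu} \epsilon_{\mu\nu}\gamma^{\mu} = T\gamma^{\nu} - \gamma^{\nu}T. \tag{324} \]

Now we may easily verify, keeping (301) in mind, that (324) is
satisfied by taking

\[ T = \frac{1}{4}\sum_{\mu,\lambda} \epsilon_{\mu\lambda}\gamma^{\mu}\gamma^{\lambda}, \]

or by taking

\[ S = \{1\} + \frac{1}{4}\sum_{\mu,\lambda} \epsilon_{\mu\lambda}\gamma^{\mu}\gamma^{\lambda}. \tag{325} \]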
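That the matrix $T = \frac{1}{4}\sum \epsilon_{\mu\lambda}\gamma^{\mu}\gamma^{\lambda}$ satisfies (324) can be checked numerically for a sample antisymmetric set of coefficients. The sketch below is an illustrative addition. The numerical values of the $\epsilon$ are arbitrary, the index 4 is stored at list position 3, and the $\gamma$ are built under the assumed reading $\gamma^k = -i\alpha^4\alpha^k$, $\gamma^4 = \alpha^4$ of (298). All names are hypothetical.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ALPHA4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def off_diagonal(s):
    """4x4 matrix with the 2x2 block s in both off-diagonal corners (an alpha^k)."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

GAMMA = [scale(-1j, matmul(ALPHA4, off_diagonal(s))) for s in SIGMA] + [ALPHA4]

def antisymmetric(upper):
    """Build a 4x4 antisymmetric array from {(m, n): value} given for m < n."""
    eps = [[0] * 4 for _ in range(4)]
    for (m, n), value in upper.items():
        eps[m][n] = value
        eps[n][m] = -value
    return eps

# Sample coefficients: real for space indices, pure imaginary with one index 4.
EPS = antisymmetric({(0, 1): 0.001, (0, 2): -0.002, (1, 2): 0.003,
                     (0, 3): 0.0015j, (1, 3): -0.0005j, (2, 3): 0.002j})

def build_T(eps):
    """T = (1/4) sum over mu, lambda of eps[mu][lambda] gamma^mu gamma^lambda."""
    total = [[0] * 4 for _ in range(4)]
    for m in range(4):
        for l in range(4):
            total = add(total, scale(0.25 * eps[m][l], matmul(GAMMA[m], GAMMA[l])))
    return total

T = build_T(EPS)

def left_side(nu):
    """sum over mu of eps[mu][nu] gamma^mu, the left side of (324)."""
    total = [[0] * 4 for _ in range(4)]
    for m in range(4):
        total = add(total, scale(EPS[m][nu], GAMMA[m]))
    return total

def right_side(nu):
    """T gamma^nu - gamma^nu T, the right side of (324)."""
    return add(matmul(T, GAMMA[nu]), scale(-1, matmul(GAMMA[nu], T)))

print(all(close(left_side(nu), right_side(nu)) for nu in range(4)))
```

The relation (324) is linear in the $\epsilon$, so it holds exactly for these finite sample values and not only in the infinitesimal limit.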


Having thus demonstrated the existence of the matrix $S$ for
an infinitesimal transformation, it follows immediately (since, as
we know, the Lorentz transformations form a group) that such a
matrix exists for any Lorentz transformation and may be
constructed as the product of an infinity of matrices of the type (325).
Let us now show that once we have constructed $S$ in this manner,
$\psi^+$ transforms according to the law

\[ \psi'^+ = \psi^+S^{-1}. \tag{326} \]

For this purpose we observe that $\psi'^+$ is defined by a formula
analogous to (303), that is, by

\[ \psi'^+ = i\tilde{\psi}'\gamma^4. \tag{327} \]

We may transform (327), noting that we have, from (318),

\[ \tilde{\psi}' = \tilde{\psi}\tilde{S}, \tag{328} \]

where we introduce the matrix $\tilde{S}$ defined, as usual, by $\tilde{S}_{\mu\lambda} = S_{\lambda\mu}^*$.
If we substitute (328) into (327) and keep (303') in mind, the
formula defining $\psi'^+$ becomes

\[ \psi'^+ = \psi^+\gamma^4\tilde{S}\gamma^4. \tag{329} \]

Thus (326) will be proved if we show that $S$, under any Lorentz
transformation, possesses the property

\[ \gamma^4\tilde{S}\gamma^4 = S^{-1}, \tag{330} \]

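and it will be sufficient to show that this formula holds for an $S$ of
the form (325), since we may verify immediately that if it holds
for two matrices $S_1$ and $S_2$, it will also hold for their product. To
prove (330), we note that we obtain from (325):

\[ \begin{aligned}
\tilde{S} &= \{1\} + \frac{1}{4}\sum_{\mu,\lambda} \epsilon_{\mu\lambda}^*\,\widetilde{\gamma^{\mu}\gamma^{\lambda}}\\
 &= \{1\} + \frac{1}{4}\sum_{i,k} \epsilon_{ik}^*\,\widetilde{\gamma^{i}\gamma^{k}}
    + \frac{1}{4}\sum_{k} \bigl(\epsilon_{4k}^*\,\widetilde{\gamma^{4}\gamma^{k}} + \epsilon_{k4}^*\,\widetilde{\gamma^{k}\gamma^{4}}\bigr).
\end{aligned} \]

From this, recalling that for Hermitian matrices such as the $\gamma$ one has
$\widetilde{AB} = \tilde{B}\tilde{A} = BA$, and that in addition $\epsilon_{ik}^* = \epsilon_{ik}$, $\epsilon_{4k}^* = -\epsilon_{4k}$,
$\epsilon_{k4}^* = -\epsilon_{k4}$, we obtain

\[ \tilde{S} = \{1\} - \frac{1}{4}\sum_{i,k} \epsilon_{ik}\gamma^{i}\gamma^{k} - \frac{1}{4}\sum_{k} \epsilon_{4k}(\gamma^{k}\gamma^{4} - \gamma^{4}\gamma^{k}). \]

Multiplying both sides by $\gamma^4$ from the right and from the left,
and recalling (301), we find that

\[ \begin{aligned}
\gamma^4\tilde{S}\gamma^4 &= \{1\} - \frac{1}{4}\sum_{i,k} \epsilon_{ik}\gamma^{i}\gamma^{k} - \frac{1}{4}\sum_{k} \epsilon_{4k}(\gamma^{4}\gamma^{k} - \gamma^{k}\gamma^{4})\\
 &= \{1\} - \frac{1}{4}\sum_{\mu,\lambda} \epsilon_{\mu\lambda}\gamma^{\mu}\gamma^{\lambda} = \{1\} - T = S^{-1};
\end{aligned} \]

thus (330) is proved.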
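The property (330) can also be tested numerically for an infinitesimal transformation. The sketch below is an illustrative addition under the same assumptions as the text: the $\epsilon$ are antisymmetric, real for space indices and pure imaginary when one index is 4 (stored here at list position 3), and the $\gamma$ follow the assumed reading $\gamma^k = -i\alpha^4\alpha^k$, $\gamma^4 = \alpha^4$ of (298). The sample values and all names are hypothetical.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ALPHA4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def off_diagonal(s):
    """4x4 matrix with the 2x2 block s in both off-diagonal corners (an alpha^k)."""
    return [
        [0, 0, s[0][0], s[0][1]],
        [0, 0, s[1][0], s[1][1]],
        [s[0][0], s[0][1], 0, 0],
        [s[1][0], s[1][1], 0, 0],
    ]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def dagger(a):
    """Hermitian conjugate, written with a tilde in the text."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

GAMMA = [scale(-1j, matmul(ALPHA4, off_diagonal(s))) for s in SIGMA] + [ALPHA4]

# Sample antisymmetric coefficients with the reality properties stated in the text.
UPPER = {(0, 1): 0.001, (0, 2): -0.002, (1, 2): 0.003,
         (0, 3): 0.0015j, (1, 3): -0.0005j, (2, 3): 0.002j}
EPS = [[0] * 4 for _ in range(4)]
for (m, n), value in UPPER.items():
    EPS[m][n] = value
    EPS[n][m] = -value

T = [[0] * 4 for _ in range(4)]
for m in range(4):
    for l in range(4):
        T = add(T, scale(0.25 * EPS[m][l], matmul(GAMMA[m], GAMMA[l])))

S = add(IDENTITY, T)                                   # (325)
S_INV_FIRST_ORDER = add(IDENTITY, scale(-1, T))        # (323)
TRANSFORMED = matmul(matmul(GAMMA[3], dagger(S)), GAMMA[3])   # gamma^4 S~ gamma^4

print(close(TRANSFORMED, S_INV_FIRST_ORDER))
```

The comparison with $\{1\} - T$ is exact for these coefficients; that $\{1\} - T$ in turn equals $S^{-1}$ holds only up to terms of second order in the $\epsilon$, as in (323).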

Having found the transformation law for $\psi$ and $\psi^+$, we may
find it immediately for the components of the four-current, that is,
of the $J_{\mu}$ defined by (307). In fact, we have

\[ J_{\mu}' = -ce\,\psi^+S^{-1}\gamma^{\mu}S\psi, \]

and since we find from (319) that

\[ S^{-1}\gamma^{\mu}S = \sum_{\nu} a_{\mu\nu}\gamma^{\nu}, \]

we have

\[ J_{\mu}' = \sum_{\nu} a_{\mu\nu}J_{\nu}, \tag{331} \]

which proves that the $J_{\mu}$ transform like the components of an
invariant four-vector; this completes the proof.

67. Gauge invariance of the Dirac equations. In the Dirac 
equations the electromagnetic field is represented by the potentials 
V and A. Now it is well known that these potentials are not 
physically determined in a unique manner, since we may also 
attribute to the same electromagnetic field (as may readily be 

A group of four quantities which transform like ^ 2 , ^ 3 , ^4 in a Tjorentz 
transformation is called a spinor (concept analogous to that of tensor). This 
concept of van der Waerden and others constitutes the basis of a systematic 
treatment analogous to tensor calculus (see Gotiinger Nachr. 1029, page 100). 
The components of a spinor do not stand for directly observable physical 
quantities (which would be contrary to the principle of relativity) but are of 
interest inasmuch as they determine indirectly the values of observable 
quantities. 
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F' = F - i A' = A + grad ,,, (332) 


where $\varphi(x, y, z, t)$ is an arbitrary function. If we use the
four-potential $\Phi_\mu$ defined by (302), we can say that the $\Phi_\mu$ may be
replaced by $\Phi'_\mu = \Phi_\mu + \partial\varphi/\partial x_\mu$. But since this
substitution does not correspond to any physical change, it is necessary
that the Dirac equations be insensitive to this alteration. More precisely,
none of the observable results furnished by the equations must be altered by
this change, such as for instance the average values of the electric
charge and current densities, that is, the $J_\mu$. (On the other hand,
$\psi$ may change, since it has no direct physical meaning.) This
property, which we shall now verify, is called gauge invariance
(Eichinvarianz in German).

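Let us write the Dirac equation in the form (300), replacing the
$\Phi_\mu$ by the $\Phi'_\mu$ and $\psi$ by a new function $\psi'$. We have,
writing the operators explicitly by means of (299'),

$$\sum_\mu \gamma^\mu\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_\mu}
  + \frac{e}{c}\Phi_\mu + \frac{e}{c}\frac{\partial\varphi}{\partial x_\mu}\right)\psi'
  - i m_0 c\,\psi' = 0.$$

Now, this equation may be satisfied by taking

$$\psi' = e^{-\frac{2\pi i}{h}\frac{e}{c}\varphi}\,\psi, \tag{333}$$

since this substitution brings the equation back to (300), as may
easily be verified. To this there corresponds, by virtue of (303),

$$\psi'^\dagger = e^{+\frac{2\pi i}{h}\frac{e}{c}\varphi}\,\psi^\dagger,$$

and, upon forming with these expressions for $\psi'$ and $\psi'^\dagger$ the
$J'_\mu$ by formula (307), we see that the exponentials cancel and the
arbitrary function $\varphi$ disappears.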
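The cancellation just described can be checked numerically in one dimension. The following is only a sketch under stated assumptions: units with $h/2\pi = 1$, an arbitrary value of $e/c$, and arbitrarily chosen test functions for $\psi$, the potential component, and the gauge function. None of these names or values comes from the text. It compares the operator $\frac{h}{2\pi i}\frac{d}{dx} + \frac{e}{c}A'$ acting on $\psi'$ with the phase factor times the original operator acting on $\psi$.

```python
import cmath
import math

HBAR = 1.0        # h / (2*pi) in arbitrary units (assumption)
E_OVER_C = 0.7    # e/c in arbitrary units (assumption)

def psi(x):
    """Arbitrary smooth test wave function (hypothetical)."""
    return cmath.exp(1j * 1.3 * x) * math.exp(-0.5 * x * x)

def gauge(x):
    """Arbitrary gauge function phi(x) (hypothetical)."""
    return math.sin(2.0 * x) + 0.3 * x * x

def potential(x):
    """Arbitrary potential component A(x) (hypothetical)."""
    return 0.4 * math.cos(x)

def d_dx(f, x, step=1e-5):
    """Central finite-difference derivative."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

def kinetic_momentum(f, a, x):
    """Apply (hbar/i) d/dx + (e/c) a(x) to the function f at the point x."""
    return (HBAR / 1j) * d_dx(f, x) + E_OVER_C * a(x) * f(x)

def phase(x):
    """Exponential factor of the gauge substitution for psi."""
    return cmath.exp(-1j * E_OVER_C * gauge(x) / HBAR)

def psi_prime(x):
    return phase(x) * psi(x)

def potential_prime(x):
    return potential(x) + d_dx(gauge, x)

def covariance_residual(x):
    """|P' psi' - phase * P psi|; should vanish up to discretization error."""
    lhs = kinetic_momentum(psi_prime, potential_prime, x)
    rhs = phase(x) * kinetic_momentum(psi, potential, x)
    return abs(lhs - rhs)

def density_difference(x):
    """Difference of |psi'|^2 and |psi|^2; the phase factor has modulus 1."""
    return abs(abs(psi_prime(x)) ** 2 - abs(psi(x)) ** 2)
```

Since the operator acting on $\psi'$ reproduces the old expression multiplied by a factor of modulus 1, any bilinear quantity formed from $\psi'$ and its conjugate is unchanged, which is the content of the statement above.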

To make the statement more general: according to the funda¬ 
mental principle of quantum mechanics (§22), the probability of 
a given result Gn in the measurement of an observable G, when the 
system is in the state $\psi$, is given, as we have seen, by where

$\varphi_n$ does not depend on $\psi$ but only on the observable $G$ and its
eigenvalue $G_n$. If $\psi$ is replaced by the product $\psi$ · $\varphi_n$ will change only
by a factor of modulus 1, and hence the square of its modulus is 
unchanged. Thus the desired probability turns out to be the same. 

58. The electron in a central field. As an application of the
Dirac theory, we shall now deal with the problem of an electron
subject only to the action of an electrostatic central force of any
type. The relations found in this way will then be used in the
following section (with the law of force specialized to the Coulomb
type) to treat the problem of hydrogenlike systems. It will be
noted that the results obtained correspond qualitatively to those
found by the modelistic spin theory: that is, by using the
Bohr-Sommerfeld model, improved by the introduction of the magnetic
moment of the electron, considered as an ordinary magnetic dipole
but subject to the spatial quantization rule (see §62 of Part II).
The Dirac theory, however, in addition to being rationally more
coherent, yields results which are slightly different and in better
agreement with experience.

Denoting, as usual, by $U = -eV(r)$ the potential energy of the
central field in which the electron is located, let us consider a
stationary state of energy $W$. The Dirac equations (272) will
assume the form


\.dxi dX2 








2in <,-W + U - tux’) , .(S .a\, , 

T-j- *• + W." 

h c ~ da: 2 / dxz ^ 


As we have seen in §46 of Part II, the Schrödinger equation
corresponding to this problem has solutions of the type $\psi = Y_{lm}f(r)$,
where $Y_{lm}$ is a surface harmonic whose indices $l$ and $m$ represent the
azimuthal quantum number and the magnetic quantum number of
the electron, respectively. Let us try to satisfy equations (334)
analogously by taking each of the four $\psi$ of the form


« m Mr). 


(335)

The spherical harmonic is here denoted** by $Z$ rather than by $Y_{lm}$
because it is convenient to normalize it (following Darwin's
treatment***) in a way different from that in §46 of Part II, in order to
simplify the expressions. To be exact, we shall take

zr = ~ 


It may then be verified, using the transformation relations for 
passing from the Cartesian coordinates Xi, X 2 , Xs to polar coordinates 
r, 6 y (py that for the derivatives of expressions of the type /Zf (where 
/ = fir) is any function of r) the following relations hold: 


(s +- 5T1 [(f - 


- (I - m){l 






lym—l 

■^1+1 




— fZf 
dxz 


21 




H- (/ + m){l — m) 


(337) 


With the aid of these formulas we may show without difficulty that
if we take the four $\psi$ of the form

$$\psi_1 = i a_1 F_+(r) Z_{l+1}^{m}, \qquad \psi_2 = i a_2 F_+(r) Z_{l+1}^{m+1},$$
$$\psi_3 = a_3 G_+(r) Z_{l}^{m}, \qquad \psi_4 = a_4 G_+(r) Z_{l}^{m+1} \tag{338}$$

(where the $a$ are constants and $F_+$, $G_+$ are two functions of $r$,
undetermined for the present), and if we substitute them into (334),
it then suffices to impose upon the constants $a$ the conditions

$$a_1 = a_2, \qquad \frac{a_3}{a_4} = -\frac{l + m + 1}{l - m}$$

for the spherical harmonics to be eliminated from the equations.

** For simplicity of notation, we shall omit writing the indices of the spherical
harmonic $Z$. These will generally be different for the four $\psi$, as we shall see
later on.

*** Proc. Roy. Soc. A118, 654 (1928).

Then the equations reduce to only two (since the first and second
equations become identical, and likewise the third and fourth), as
follows:

_ ^ {-W + U + moc^) _ a, / dG^ _ I 

, h c ^ I -- m\ dr r 

2Tr { — W + U — rrioc^) ^4 ^ , f dF+ 1 + 2 

~h - ^+ 

The constants $a_1$ and $a_4$ remain arbitrary, and we shall take them
equal to 1 and to $-(l - m)$, respectively, so that we have

$$a_1 = 1, \quad a_2 = 1, \quad a_3 = l + m + 1, \quad a_4 = -(l - m), \tag{339}$$

and the preceding equations become

$$-\frac{2\pi}{h}\,\frac{-W + U + m_0c^2}{c}\,F_+ + \frac{dG_+}{dr} - \frac{l}{r}\,G_+ = 0,$$
$$\frac{2\pi}{h}\,\frac{-W + U - m_0c^2}{c}\,G_+ + \frac{dF_+}{dr} + \frac{l + 2}{r}\,F_+ = 0. \tag{340}$$

It is to be noted that a solution of the form considered here can
exist only if $m$ lies between $-(l + 1)$ and $l$ (endpoints included);
otherwise there would occur spherical harmonics whose upper index
would be larger (in absolute value) than the lower one, and we have
not assigned any meaning to such symbols. As far as $l$ is concerned,
it may take the values 0, 1, 2, ... .

A second way of satisfying (334) consists in taking the $\psi$ of
the form

$$\psi_1 = i b_1 F_-(r) Z_{l-1}^{m}, \qquad \psi_2 = i b_2 F_-(r) Z_{l-1}^{m+1},$$
$$\psi_3 = b_3 G_-(r) Z_{l}^{m}, \qquad \psi_4 = b_4 G_-(r) Z_{l}^{m+1}. \tag{341}$$

Substituting into (334) and proceeding as above, we find that if
we take

$$b_1 = l + m, \quad b_2 = -(l - m - 1), \quad b_3 = 1, \quad b_4 = 1, \tag{342}$$

these equations reduce to the two following relations in the
functions $F_-(r)$ and $G_-(r)$:

$$-\frac{2\pi}{h}\,\frac{-W + U + m_0c^2}{c}\,F_- + \frac{dG_-}{dr} + \frac{l + 1}{r}\,G_- = 0,$$
$$\frac{2\pi}{h}\,\frac{-W + U - m_0c^2}{c}\,G_- + \frac{dF_-}{dr} - \frac{l - 1}{r}\,F_- = 0. \tag{343}$$

Note that these equations differ from (340), which are satisfied
by $F_+$ and $G_+$, simply by the substitution of $-(l + 1)$ for $l$. A
solution of this form can exist only if $m$ lies between $-l$ and $l - 1$,
endpoints included. Furthermore, there is no solution for $l = 0$,
since there are no spherical harmonics with negative lower index.

To interpret the two types of solutions found in this way, we
recall (see §53) that the total angular momentum with respect to
the z-axis corresponds to the operator

0 0 0 

1 0 0 

0 1 0 

0 0 

Let us now apply this operator to the $\psi$ of the form (338) or
(341), noting that

$$\frac{\partial}{\partial\varphi}Z_l^m = i m Z_l^m.$$

We find in both cases that

$$N_z\psi = \left(m + \tfrac{1}{2}\right)\frac{h}{2\pi}\,\psi,$$

which shows that all the solutions which we have here considered
represent states in which the angular momentum with respect to
the z-axis has a definite value, specifically the value $(m + \tfrac12)h/2\pi$.
Keeping in mind the limits between which $m$ may vary, we see
that $N_z$ can vary, in case (338), from $-(l + \tfrac12)h/2\pi$ to $+(l + \tfrac12)h/2\pi$
and, in case (341), from $-(l - \tfrac12)h/2\pi$ to $+(l - \tfrac12)h/2\pi$. These
results are identical with those obtained from the modelistic theory,
if $N_z$ is interpreted as the projection of the total angular momentum
upon the z-axis and if the total angular momentum is taken as equal
to $jh/2\pi$, where $j$ is the inner quantum number (see §62 of Part II);
that is, $j = l + \tfrac12$ for case (338), and $j = l - \tfrac12$ for case (341).
The first solution therefore corresponds, in the modelistic
interpretation, to the case of the spin parallel to the orbital angular
momentum, and the second solution to the spin antiparallel to that
momentum.

Finally, in the case where $l = 0$, we have pointed out that the
solution (341) is absent, which means that $N_z$ can have only the
value $\pm\tfrac12\,h/2\pi$ (or $j$ only the value $\tfrac12$), which also follows
immediately from the vector model.
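The counting of states in the two cases can be made explicit with a short enumeration. This is only an illustrative sketch: the function names and the labels "parallel" and "antiparallel" are hypothetical, and the values of $N_z$ are expressed in units of $h/2\pi$.

```python
from fractions import Fraction

def m_values(l, kind):
    """Allowed values of m for the two forms of solution.

    kind = "parallel"     : form (338), j = l + 1/2, m from -(l+1) to l
    kind = "antiparallel" : form (341), j = l - 1/2, m from -l to l-1
    """
    if kind == "parallel":
        return list(range(-(l + 1), l + 1))
    if kind == "antiparallel":
        return list(range(-l, l))
    raise ValueError("kind must be 'parallel' or 'antiparallel'")

def nz_values(l, kind):
    """Values of N_z in units of h/(2*pi), namely m + 1/2."""
    return [m + Fraction(1, 2) for m in m_values(l, kind)]

if __name__ == "__main__":
    for l in range(3):
        for kind in ("parallel", "antiparallel"):
            print(l, kind, [str(v) for v in nz_values(l, kind)])
```

In each case the number of values is $2j + 1$, and for $l = 0$ the second list is empty, in agreement with the absence of solution (341).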

59. Theory of hydrogenlike systems (fine-structure). We shall
now apply the results of the preceding section to the case of a
hydrogenlike system; that is, we shall specialize the function $U$ by
taking

$$U = -\frac{Ze^2}{r}. \tag{344}$$

First let us treat the case of solution (338), that is, of $j = l + \tfrac12$,
and let us seek the eigenvalues (for the parameter $W$) and the
eigenfunctions of equations (340), limiting ourselves to the case in
which the energy $E = W - m_0c^2$ is negative (case corresponding to
elliptical orbits). We introduce the notation


+ = (345) 

and in addition we introduce the fine-structure constant $\alpha = 2\pi e^2/hc$.
Equations (340) may then be written


dG+ 

dr 


+ = 0 , 




(346)


We observe that for r approaching infinity, these equations 
approach the form 


.BW^ + ^ = 0, 


dF^ 

+ V = 0- 


from which we obtain 


dW+ 

dr^ 


- Amw+ = 0 , 


d'^+ 

dr^ 


- A^B^+ = 0. 


Hence the solutions $F_+$ and $G_+$ of (346) will have $e^{\pm ABr}$ for their
asymptotic expression. Since we are looking for solutions going
to zero for $r \to \infty$, we must discard the plus sign. Thus we are led
to look for solutions of the form

$$F_+ = e^{-ABr}\left(a_0 r^{\gamma} + a_1 r^{\gamma+1} + \cdots\right),$$
$$G_+ = e^{-ABr}\left(b_0 r^{\gamma} + b_1 r^{\gamma+1} + \cdots\right). \tag{347}$$

The other possible singular point is $r = 0$. There the preceding
expressions are regular if $\gamma \geq 0$; whereas if $\gamma < 0$, they become
infinite to the order $|\gamma|$. Such a singularity is acceptable, provided
that $|\gamma| < 1$ (see §28 of Part II).

Substituting these expressions into (346) and setting the
coefficients of $r^{\gamma-1}$ equal to zero at the same time, we find

$$Z\alpha\,a_0 + (\gamma - l)\,b_0 = 0,$$
$$(\gamma + l + 2)\,a_0 - Z\alpha\,b_0 = 0;$$

and upon setting the determinant of the coefficients of these two
linear equations in $a_0$, $b_0$ equal to zero, we find for $\gamma$ the equation

$$(\gamma - l)(\gamma + l + 2) + Z^2\alpha^2 = 0,$$

which yields (discarding the solution with the minus sign, which
would give $\gamma < -1$)

$$\gamma = -1 + \sqrt{(l + 1)^2 - Z^2\alpha^2}. \tag{348}$$

Now upon setting the coefficient of $r^{\gamma+s}$ ($s = 0, 1, 2, \ldots$) equal
to zero, we find

$$-B(Ba_s - Ab_s) + Z\alpha\,a_{s+1} + (\gamma + s - l + 1)\,b_{s+1} = 0,$$
$$A(Ba_s - Ab_s) - Z\alpha\,b_{s+1} + (\gamma + s + l + 3)\,a_{s+1} = 0, \tag{349}$$

from which we obtain, multiplying the first by $A$ and the second
by $B$ and adding,

$$a_{s+1}\left[AZ\alpha + B(\gamma + s + l + 3)\right]
  + b_{s+1}\left[A(\gamma + s - l + 1) - BZ\alpha\right] = 0.$$

It is now convenient to introduce a single constant $c_{s+1}$ in place of
$a_{s+1}$ and $b_{s+1}$, as follows:

$$a_{s+1} = c_{s+1}\left[A(\gamma + s - l + 1) - BZ\alpha\right],$$
$$b_{s+1} = -c_{s+1}\left[AZ\alpha + B(\gamma + s + l + 3)\right].$$

Substituting into (349) these expressions and analogous ones for $a_s$
and $b_s$, we find the recursion formula for the $c_s$:

$$c_{s+1} = \frac{-Z\alpha(A^2 - B^2) - 2AB(\gamma + s + 1)}
  {Z^2\alpha^2 + (\gamma + s + l + 3)(\gamma + s - l + 1)}\;c_s. \tag{350}$$
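The algebra leading from (349) to (350) can be spot-checked numerically. The sketch below treats $A$, $B$, $Z\alpha$ and $l$ as free numerical parameters (the particular values are arbitrary assumptions, and the function names are hypothetical). It builds the $c_s$ from (350), forms $a_s$ and $b_s$ from the substitution above, and evaluates the left members of (349) and of the two equations that determine $\gamma$.

```python
import math

def gamma_exponent(l, z_alpha):
    """Exponent gamma from (348)."""
    return -1.0 + math.sqrt((l + 1) ** 2 - z_alpha ** 2)

def series_coefficients(l, z_alpha, A, B, n_terms, c0=1.0):
    """Return gamma and the lists a_s, b_s built from the recursion (350)."""
    g = gamma_exponent(l, z_alpha)
    c = [c0]
    for s in range(n_terms - 1):
        num = -z_alpha * (A * A - B * B) - 2.0 * A * B * (g + s + 1)
        den = z_alpha ** 2 + (g + s + l + 3) * (g + s - l + 1)
        c.append(c[-1] * num / den)
    # a_s and b_s expressed through c_s (the substitution with s+1 -> s)
    a = [c[s] * (A * (g + s - l) - B * z_alpha) for s in range(n_terms)]
    b = [-c[s] * (A * z_alpha + B * (g + s + l + 2)) for s in range(n_terms)]
    return g, a, b

def max_residual(l, z_alpha, A, B, n_terms=6):
    """Largest absolute left member of (349) over the computed terms."""
    g, a, b = series_coefficients(l, z_alpha, A, B, n_terms)
    worst = 0.0
    for s in range(n_terms - 1):
        common = B * a[s] - A * b[s]
        r1 = -B * common + z_alpha * a[s + 1] + (g + s - l + 1) * b[s + 1]
        r2 = A * common - z_alpha * b[s + 1] + (g + s + l + 3) * a[s + 1]
        worst = max(worst, abs(r1), abs(r2))
    return worst

def indicial_residual(l, z_alpha, A, B):
    """Left members of the two equations for a_0, b_0 that fix gamma."""
    g, a, b = series_coefficients(l, z_alpha, A, B, 1)
    r1 = z_alpha * a[0] + (g - l) * b[0]
    r2 = (g + l + 2) * a[0] - z_alpha * b[0]
    return max(abs(r1), abs(r2))
```

Both residuals vanish to rounding error for any choice of the parameters, which confirms that the single constant $c_s$ satisfies the two equations (349) simultaneously.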

The functions (347) will certainly be zero at infinity if the series
reduce to polynomials. Calling $n'$ the degree of the latter, we
must have for this purpose $c_{n'} \neq 0$ and $c_{n'+1} = 0$, and hence,
because of (350),

$$\frac{Z\alpha(B^2 - A^2)}{2AB} = \gamma + n' + 1,$$

or, by (348),

$$\frac{Z\alpha(B^2 - A^2)}{2AB} = \sqrt{(l + 1)^2 - Z^2\alpha^2} + n'.$$

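Substituting expressions (345) for $A$ and $B$ and solving for $W$,
we find

$$W = m_0c^2\left[1 + \frac{Z^2\alpha^2}
  {\left(\sqrt{(l + 1)^2 - Z^2\alpha^2} + n'\right)^2}\right]^{-1/2}, \tag{351}$$

and recalling that for the solution with which we are dealing here
we have $j = l + \tfrac12$, we obtain

$$W = m_0c^2\left[1 + \frac{Z^2\alpha^2}
  {\left(\sqrt{(j + \tfrac12)^2 - Z^2\alpha^2} + n'\right)^2}\right]^{-1/2}. \tag{352}$$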

If we proceed now to consider the solution (341) corresponding
to $j = l - \tfrac12$, it is not necessary to repeat the calculation, since
it is sufficient to note that equations (343) differ from equations
(340) only by the substitution of $-(l + 1)$ for $l$. Hence upon
performing this same substitution in formula (351), we find

$$W = m_0c^2\left[1 + \frac{Z^2\alpha^2}
  {\left(\sqrt{l^2 - Z^2\alpha^2} + n'\right)^2}\right]^{-1/2};$$

but then introducing $j + \tfrac12$ instead of $l$, we are back to formula
(352). The latter therefore expresses the energy levels in both
cases. Let us develop it in a power series in $\alpha$ up to terms in
$\alpha^4$ inclusive (an approximation which is more than sufficient for a
comparison with experiment), and let us remove the term $m_0c^2$
which represents the rest energy. We then find for the energy $E$
(other than the rest energy) the following expression, where we
have set $n = j + \tfrac12 + n'$ and where we have used the fact that
$m_0c^2\alpha^2 = 2hcR$ ($R$ = Rydberg constant):


$$E_{n,j} = -\frac{RhcZ^2}{n^2}\left[1 + \frac{Z^2\alpha^2}{n^2}
  \left(\frac{n}{j + \tfrac12} - \frac{3}{4}\right)\right]. \tag{353}$$

If we neglect the term in $\alpha^2$, we find again the well-known
expression for the Balmer terms as given by the Schrödinger equation
(see §48 of Part II). Thus we recognize that the $n$ which was
just introduced (and which is an integer) must be identified with
the principal quantum number. If, however, we take the correction
term in $\alpha^2$ into account, we see that upon fixing the value of $n$,
we obtain from this formula, instead of the single energy level of
the Schrödinger theory, $n$ different energy levels, corresponding to
the $n$ values which may be taken by $j$ (from $\tfrac12$ to $n - \tfrac12$). These
levels, however, lie very close together, since $j$ occurs only in the
term in $\alpha^2$. This splitting is due to the combined effects of the
relativistic correction and the spin, which are both contained
intrinsically in the Dirac equations. To the splitting of the levels
there naturally corresponds a splitting of the spectral lines, which
constitutes their fine-structure (see §60 of Part II). The line
emitted in the transition from a level of principal quantum number
$n$ to one of principal quantum number $n'$ must be made up of
$nn'$ components, of which some, however, are excluded by the
selection rules of the inner quantum number $j$ (see §64d of Part II).
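To give an idea of the orders of magnitude, the exact formula (352) and the expansion (353) may be evaluated side by side. The sketch below is illustrative only: the numerical values of $\alpha$ and of $m_0c^2$ are rounded modern values inserted as assumptions (they are not taken from the text), energies are in electron volts, and the function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant (assumed modern value)
M0C2_EV = 510998.95        # electron rest energy in eV (assumed modern value)

def exact_energy_ev(n, j, z=1):
    """E = W - m0 c^2 from formula (352), with n = j + 1/2 + n'."""
    k = j + 0.5
    n_prime = n - k
    root = math.sqrt(k * k - (z * ALPHA) ** 2)
    w = M0C2_EV / math.sqrt(1.0 + (z * ALPHA) ** 2 / (root + n_prime) ** 2)
    return w - M0C2_EV

def expanded_energy_ev(n, j, z=1):
    """Expansion (353), using R h c = m0 c^2 alpha^2 / 2."""
    rhc = 0.5 * M0C2_EV * ALPHA ** 2
    za2 = (z * ALPHA) ** 2
    return -rhc * z * z / n ** 2 * (1.0 + za2 / n ** 2 * (n / (j + 0.5) - 0.75))

def levels(n, z=1):
    """The n levels belonging to principal quantum number n (j = 1/2 ... n - 1/2)."""
    return {k - 0.5: exact_energy_ev(n, k - 0.5, z) for k in range(1, n + 1)}

if __name__ == "__main__":
    for j, energy in levels(2).items():
        print(f"n=2, j={j}: exact {energy:.8f} eV, "
              f"expanded {expanded_energy_ev(2, j):.8f} eV")
```

For hydrogen with $n = 2$ the two levels $j = \tfrac12$ and $j = \tfrac32$ differ by only a few hundred-thousandths of an electron volt, which illustrates how close together the components lie.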

The formula just found for the fine-structure of hydrogenlike
spectra differs from (341) in §60 of Part II (which resulted from
the application of the relativistic correction to the Bohr-Sommerfeld
theory) only in having the number $j + \tfrac12$ in place of $k$. But since
both $j + \tfrac12$ and $k$ take on the same series of values (from 1 to $n$),
the two formulas lead to exactly the same energy levels (although
associated with different quantum numbers) and hence to the same
line structure, which agrees quite well with the one which is observed
experimentally. However, the results of the two theories are different
with regard to the intensity, which may be calculated, in the
case of the old theory, by means of the correspondence principle,
and in the new theory by means of the Dirac theory of radiation.*
We shall omit these calculations, confining ourselves to stating
that experience is in favor of the Dirac theory. For a detailed
comparison between the theory outlined above and the observed
fine-structure of the spectra of hydrogen and He+ (which latter
lends itself better to an experimental investigation, as was already
pointed out), we refer to No. 1 of the Bibliography, page 316. We
restrict ourselves here to reproducing, as an example, a graphical
comparison for the line λ4686 of He+, corresponding to the transition
from $n = 4$ to $n = 3$ (Fig. 47). The curve represents the intensity
as a function of the frequency, as obtained from spectrophotometric
measurements of the line. The vertical line segments indicate the
positions of the components as predicted by the theory, their length
being a measure of the calculated intensity, in arbitrary units (the
comparison concerns only the ratios of the intensities of the various
components, and the distances between them).

* See, for instance, No. 1 of the Bibliography, page 447.

60. States of negative kinetic energy and positron theory. The Dirac
equation possesses (as we have seen in §54 for the particular case of
plane waves and $U = 0$), alongside any solution representing a
stationary state with positive kinetic energy $\mathcal{T}$,* an analogous
solution with negative kinetic energy. This arises from the fact that in
relativistic mechanics, the kinetic energy $\mathcal{T}$ of a particle is
related to its momentum by the equation

$$\frac{\mathcal{T}^2}{c^2} = m_0^2c^2 + p^2,$$

which is quadratic in $\mathcal{T}$, whereas the corresponding relation of
classical mechanics is linear. Therefore even in classical
(non-quantized) relativistic mechanics there is a possibility of having
solutions with negative kinetic energy. Specifically, we obtain from
the last equation (neglecting to write down the terms of order higher
than the second in $v/c$)

$$\mathcal{T} = \pm\left(m_0c^2 + \tfrac{1}{2}m_0v^2 + \cdots\right).$$

We see that $\mathcal{T}$ may vary over two intervals (from $-\infty$ to
$-m_0c^2$ and from $+m_0c^2$ to $+\infty$) which are separated from each other.
Therefore, since in non-quantized mechanics the kinetic energy can
only vary continuously, it may not pass from one interval to the
other and hence, if the motion is started with positive $\mathcal{T}$, states
with negative kinetic energy no longer have to be considered. In
quantum mechanics, however, these considerations do not hold,
because we may no longer assert that $\mathcal{T}$ varies continuously. There
are cases now in which there is a finite probability, or even the
certainty, that the electron passes from a state with positive
$\mathcal{T}$ ($> m_0c^2$) to one with negative $\mathcal{T}$ ($< -m_0c^2$). O. Klein has
pointed out a typical case,** in which an electron with total energy
$W$ ($> 0$) is directed at a potential step such as the one in Fig. 25, of
height $U_0 > W + m_0c^2$. Then, if $\psi$ to the left of the step is
represented by (292), it is represented by analogous waves on the right,
in which $p$ is replaced by a real $p'$ such that

$$\frac{W - U_0}{c} = -\sqrt{m_0^2c^2 + p'^2}.$$

Hence the waves continue past the step (in sinusoidal form, not in
exponential form as in the Schrödinger theory; see §40 of Part II),
where the kinetic energy is $\mathcal{T} = W - U_0$ and is therefore negative.
This means that the electron has a certain probability, which may
be considerable, of overcoming the barrier by simultaneously
passing into a state with negative kinetic energy.*** As a matter
of fact, even without the intervention of external forces like the
ones considered by Klein, transitions to states of negative kinetic
energy may occur. From the radiation theory of Dirac we even
get the result that as soon as an electron is put into a state of
positive kinetic energy, it would undergo such a transition
immediately and spontaneously, while radiating the energy difference.

* For short, we shall also include the rest energy $m_0c^2$ in the kinetic energy,
and shall designate it by $\mathcal{T}$ in order to distinguish it from the ordinary
kinetic energy $T$. We have $\mathcal{T} = m_0c^2 + T$. The total energy
$\mathcal{T} + U$ will then be denoted by $W$ as usual. In the case of §54 it was
assumed that $U = 0$, and hence $W = \mathcal{T}$.
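The three regimes at a potential step can be tabulated from the relation $(W - U_0)^2 = m_0^2c^4 + p'^2c^2$. The following is a sketch only: energies are in MeV, the rest energy is a rounded modern value inserted as an assumption, and the function name and regime labels are hypothetical.

```python
import math

M0C2_MEV = 0.51099895   # electron rest energy in MeV (assumed modern value)

def beyond_step(total_energy_mev, step_height_mev):
    """Classify the wave beyond a step of height U0 for total energy W.

    Returns (regime, p_prime_c), where p_prime_c is p'c in MeV, or None
    when p' is imaginary and the wave is exponentially damped.
    """
    kinetic = total_energy_mev - step_height_mev   # W - U0 beyond the step
    radicand = kinetic ** 2 - M0C2_MEV ** 2
    if radicand < 0.0:
        return ("damped", None)
    p_prime_c = math.sqrt(radicand)
    if kinetic > 0.0:
        return ("positive kinetic energy", p_prime_c)
    return ("negative kinetic energy", p_prime_c)

if __name__ == "__main__":
    w = 1.0
    for u0 in (0.2, 1.2, 2.0):
        print(f"W = {w} MeV, U0 = {u0} MeV ->", beyond_step(w, u0))
```

A real $p'$ together with negative $W - U_0$ occurs exactly when $U_0 > W + m_0c^2$, which is the condition stated for the case of Klein.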

The properties of an electron with negative kinetic energy should
be rather singular: when placed in an electric and magnetic field,
it would acquire an acceleration directed in the sense opposite to
that of an ordinary electron (and would behave in this respect like a
positive electron); what is even more unusual, its energy would
continue to decrease as its velocity increased.

** Zeits. f. Physik 53, 157 (1929).

*** The analytical reason for this fact is that the plane waves with positive
kinetic energy do not by themselves constitute a complete system of orthogonal
functions. In order to have a complete set, we must add the waves with
negative kinetic energy. Hence, given an arbitrary initial $\psi$, it is not in
general possible to expand it in Fourier integrals without also employing terms
with negative kinetic energy.

In order to show that a negative electron of kinetic energy
$\mathcal{T}$ ($< 0$) moves as would a positive electron with positive kinetic
energy $-\mathcal{T}$ in the same field, let us denote by $\psi$ an eigenfunction
(matrix) of the former, relative to a stationary state of negative
kinetic energy $\mathcal{T}$; it will satisfy the Dirac equation

$$\left(\sum_k \alpha_k P_k - P_t + m_0c\,\alpha_4\right)\psi = 0, \tag{354}$$

with

$$P_k = \frac{h}{2\pi i}\frac{\partial}{\partial x_k} + \frac{e}{c}A_k, \qquad
  P_t = \frac{1}{c}(W + eV) = \frac{\mathcal{T}}{c}. \tag{355}$$

On the other hand, a positive electron of kinetic energy $-\mathcal{T}$
would correspond to an eigenfunction $\psi'$ satisfying the equation

$$\left(-\sum_k \alpha_k P_k^* + P_t + m_0c\,\alpha_4\right)\psi' = 0, \tag{356}$$

obtained from the preceding one by changing $e$ into $-e$ and $\mathcal{T}$ into
$-\mathcal{T}$.

We shall now show that (356) is satisfied by taking

$$\psi' = S\psi^*, \quad \text{with } S = \alpha_2\alpha_4. \tag{357}$$

In fact, the matrix $S$ defined in this way has the following
property:

$$\alpha_k S = S\alpha_k^*, \qquad \alpha_4 S = -S\alpha_4^*, \tag{358}$$

which may be verified immediately by applying (266) and noting
that because of (267), the complex conjugates of the matrices $\alpha$ are
equal to the same matrices, except for $\alpha_2^*$, which is equal to $-\alpha_2$.
Hence if we substitute (357) into (356) and take (358) into account,
the left member becomes

$$-S\left(\sum_k \alpha_k^* P_k^* - P_t + m_0c\,\alpha_4^*\right)\psi^*;$$

that is, it is equal to the complex conjugate of the left member of
(354), multiplied by $-S$ from the left. Therefore the last expression
is zero by virtue of (354), and (356) is satisfied. Since (357)
holds for any eigenfunction, it will also be valid for a sum of
eigenfunctions; so that if $\psi$ represents a wave packet corresponding to a
negative electron with negative kinetic energy (which means that
$\psi$ is formed, at least predominantly, with eigenfunctions corresponding
to negative values of $\mathcal{T}$), then $\psi' = S\psi^*$ represents a wave packet
corresponding to a positive electron with positive kinetic energy.
And since, as we may readily verify, $|\psi'|^2 = |\psi|^2$, the two packets
coincide and move in the same manner.
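Property (358) lends itself to a direct check with explicit matrices. The sketch below assumes the customary representation $\alpha_k = \rho_1\sigma_k$, $\alpha_4 = \rho_3$, in which only $\alpha_2$ is imaginary; this choice, the matrix $S = \alpha_2\alpha_4$, and all function names are assumptions made for illustration and may differ from (266) by the ordering or by a constant factor.

```python
# Pauli matrices, used both as sigma (spin) and rho (upper/lower pair) factors.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
ID2 = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][m] for j in range(2) for m in range(2)]
            for i in range(2) for k in range(2)]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def conj(a):
    """Element-wise complex conjugate."""
    return [[complex(x).conjugate() for x in row] for row in a]

def neg(a):
    return [[-x for x in row] for row in a]

# Assumed representation: alpha_k = rho_1 x sigma_k, alpha_4 = rho_3 x 1.
ALPHA = [kron(SX, SX), kron(SX, SY), kron(SX, SZ), kron(SZ, ID2)]
S = matmul(ALPHA[1], ALPHA[3])   # S = alpha_2 alpha_4 (assumption)

def property_358_holds():
    """Check alpha_k S = S alpha_k* (k = 1, 2, 3) and alpha_4 S = -S alpha_4*."""
    space_ok = all(matmul(a, S) == matmul(S, conj(a)) for a in ALPHA[:3])
    time_ok = matmul(ALPHA[3], S) == neg(matmul(S, conj(ALPHA[3])))
    return space_ok and time_ok

def s_is_unitary():
    """S-dagger S = 1, so that the sum of |psi_i|^2 is unchanged."""
    s_dagger = [[complex(S[j][i]).conjugate() for j in range(4)] for i in range(4)]
    identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    return matmul(s_dagger, S) == identity
```

The conditions (358) fix $S$ only up to a constant factor, so any multiple of the matrix used here would pass the first check equally well.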

It is not legitimate, however, to go ahead and identify a negative
electron with negative kinetic energy with a positron. It suffices
to observe that its kinetic energy as a function of the momentum $p$
is expressed in first approximation by

$$\mathcal{T} = -\left(m_0c^2 + \frac{p^2}{2m_0}\right), \tag{359}$$

and hence decreases with increasing momentum or increasing
velocity, which certainly does not correspond to the properties of
any of the experimentally observed particles.

From the very beginning, the existence of negative kinetic
energy states and the impossibility of ignoring them seemed to
constitute one of the most serious objections to the Dirac theory.
In order to obviate this difficulty, Dirac proposed* an ingenious
interpretation of these states, which led to the prediction of the
existence of a positive electron and showed the way to calculate its
properties. This theory, although still not devoid of weak and
obscure points, has earned great acclaim, especially since the
experimental discovery of the positron. Here, in brief, is what is involved.

Dirac makes the hypothesis that a space devoid of electric
charges must not be thought of as a space in which there are no
electrons; rather, it must be considered to contain an infinite number
of electrons with negative energy, and exactly one for each
stationary state. This particular distribution of electrons, which we
shall call "normal," is stable by virtue of the Pauli principle, which
prevents an electron from occupying a state which is already
occupied by another electron. The "normal" distribution is not to give
rise to any electric field; for this reason it is necessary to make the
further hypothesis that in the equation div $\mathbf{E} = 4\pi\rho$ we are always
to understand by $\rho$, not the charge due to all electrons present
(which would be infinite) but only the difference between that
charge and the charge due to the "normal" distribution. If we
now modify the normal distribution, by raising one of the electrons
from one of the negative energy states (represented, for instance, by
the eigenfunction $\psi_{n'}$) to a state of positive energy (whose
eigenfunction is to be $\psi_{n''}$), the newly obtained distribution will differ
from a normal one in two respects: the presence of an electron in
the state $n''$, and the absence of an electron from the state $n'$. The
first of these facts gives rise to all the ordinary manifestations of an
ordinary electron in the state $n''$, and the "hole" left in the
negative energy states will manifest itself as an electron with positive
charge, or positron,** in the state $n'$. More generally, a positron in a
general state will be represented by subtracting from the normal
distribution a term $\sum_{n'} c_{n'}\psi_{n'}$ (the sum being extended over all or
part of the negative energy states), that is, by subtracting from the
normal distribution a wave packet with negative kinetic energy,
which, as we have seen, moves in the same way as an electron of
charge $+e$. Thus, interpreting a positron not as an existing wave
packet but as a missing wave packet, we eliminate the
above-mentioned difficulty concerning the energy. In fact, if the velocity
of the positrons increases, the missing kinetic energy of the system
decreases, or, in other words, its total kinetic energy increases, as it
should.

* Proc. Roy. Soc. A126, 360 (1931); ibid., A133, 60 (1931). See also No. 6
of the Bibliography.

As an obvious corollary of this theory, it should be possible to
create an electron-positron pair. To a certain extent this
phenomenon would be analogous to the ordinary excitation of an atom,
since it would consist in bringing an electron from a state of negative
kinetic energy (where the latter does not reveal itself in any way)
to a state of positive kinetic energy. It is to be noted, however,
since the energy of the first state is below $-m_0c^2$ and that of the
final state is above $m_0c^2$, that the energy necessary for pair-creation
should ever so slightly exceed $2m_0c^2$, which amounts to about one
million volts. This energy might be furnished by X rays or gamma
rays, provided they are of a frequency $\nu$ such that $h\nu > 2m_0c^2$
(that is, of wave number $\tilde{\nu} > 8.3 \times 10^{9}$ cm$^{-1}$). In that case the
photon may "materialize"; that is, it may transform into a pair of
electrons, one positive and one negative, having between them a kinetic
energy $h\nu - 2m_0c^2$. One may further calculate, on the basis of
the Dirac theory, the probability for this materialization of radiation
(pair-creation) to occur, and it is found that the phenomenon is
possible in practice only when the photon traverses the electric field
of a nucleus.

** In the paper cited, Dirac interpreted the "holes" as protons, since the
existence of the positron was then unknown. However, it was not explained
why the mass of the positive particles was so different from the mass of the
electron. With the experimental discovery of the positron (1932), the Dirac
"holes" found their most direct and natural interpretation.
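The threshold figures quoted above follow from the rest energy of the electron. The following sketch uses rounded modern values of the constants, inserted here as assumptions, with hypothetical function names; it returns the threshold energy, the corresponding frequency, and the wave number.

```python
PLANCK_EV_S = 4.135667696e-15       # h in eV s (assumed modern value)
LIGHT_SPEED_CM_S = 2.99792458e10    # c in cm/s (assumed modern value)
M0C2_EV = 510998.95                 # electron rest energy in eV (assumed)

def pair_threshold():
    """Return (energy in eV, frequency in 1/s, wave number in 1/cm)."""
    energy_ev = 2.0 * M0C2_EV
    frequency_hz = energy_ev / PLANCK_EV_S
    wavenumber_per_cm = frequency_hz / LIGHT_SPEED_CM_S
    return energy_ev, frequency_hz, wavenumber_per_cm

def can_create_pair(photon_energy_ev):
    """True when h*nu exceeds 2 m0 c^2."""
    return photon_energy_ev > 2.0 * M0C2_EV

if __name__ == "__main__":
    energy, frequency, wavenumber = pair_threshold()
    print(f"threshold energy: {energy / 1e6:.3f} million volts")
    print(f"frequency:        {frequency:.3e} per second")
    print(f"wave number:      {wavenumber:.2e} per cm")
```

With these values the threshold wave number comes out slightly above $8.2 \times 10^{9}$ cm$^{-1}$, in agreement with the rounded figure given in the text.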

These theoretical predictions are in good agreement with numerous
experimental results, principally those of I. Curie and F. Joliot,
Chadwick, Blackett and Occhialini, Anderson and Neddermeyer,
and Meitner and Philipp.* From these experiments it turns out,
for instance, that if the gamma rays of thorium C" (whose photons
have an energy of $2.65 \times 10^{6}$ volts) are made to impinge upon a
lead screen in a Wilson chamber, positive and negative electrons
will leave the lead (the latter being considerably more abundant)
whose sign and velocity may be determined by deflecting them in a
magnetic field. It is found that whereas the kinetic energy of the
negative electrons may reach a value equal to the entire photon
energy, the kinetic energy of the positive electrons is limited with
sufficient sharpness to not more than $1.6 \times 10^{6}$ volts. This result
is interpreted as follows: the negative electrons are due, in addition
to the materialization of the photons, to the ordinary photoelectric
effect (in which, as we know, almost the entire energy of the photon
may transform into kinetic energy), whereas the positrons are due
to pair-creation exclusively. Since this phenomenon absorbs an
energy of about one million volts, the pair which is formed will
have a kinetic energy of 2.6 − 1 = 1.6 million volts (part of which
is lost in traversing the lead). Hence neither of the two particles
may have a kinetic energy above this limit. In some rare cases
it has been possible to observe the creation of an electron pair
(+ and −) within the gas of a Wilson cloud chamber, rather than
in the lead. In that event the particles are not slowed down
appreciably, and hence the sum of their kinetic energies must be
equal to $h\nu - 2m_0c^2$, which value is generally well verified.
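The energy balance used in this interpretation can be written out explicitly. The sketch below works in millions of volts, takes the photon energy of 2.65 quoted in the text, and uses a rounded modern value of the rest energy as an assumption; the function names are hypothetical.

```python
M0C2_MEV = 0.51099895   # electron rest energy in millions of volts (assumed)

def pair_kinetic_energy(photon_energy_mev):
    """Kinetic energy shared by the pair, h*nu - 2 m0 c^2 (None below threshold)."""
    available = photon_energy_mev - 2.0 * M0C2_MEV
    return available if available > 0.0 else None

def upper_limits(photon_energy_mev):
    """Upper limits of kinetic energy for electrons and for positrons.

    Negative electrons may also arise from the photoelectric effect and so
    may carry nearly the whole photon energy; positrons arise from
    pair-creation only.
    """
    return {
        "negative electron": photon_energy_mev,
        "positron": pair_kinetic_energy(photon_energy_mev),
    }

if __name__ == "__main__":
    print(upper_limits(2.65))
```

The limit obtained for the positrons is about 1.63 million volts, which agrees with the observed limit of 1.6 to within the rounding used in the text.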

The inverse of the preceding phenomenon is the falling of an
electron from a state of positive energy into one of the few
unoccupied negative energy states, that is, the recombination of a negative
electron with a positron, which both disappear while liberating a
quantity of energy equal to $2m_0c^2$ in the form of radiation (plus
their kinetic energy, which is generally negligible). This phenomenon,
called annihilation, may take place essentially in two ways:
(a) a positron encounters a free electron and combines with it,
emitting two photons in opposite directions (not a single one,
because the two electrons could not give it the required momentum).
Therefore each of the two photons has an energy $m_0c^2$, or about
half a million volts (where the kinetic energy of the two electrons
is neglected); (b) a positron combines with an electron which is
strongly bound to a nucleus. In case (b), the emission of a single
photon is possible, of energy about equal to $2m_0c^2$, that is, of a
frequency double that of case (a), and the recoil momentum is
imparted to the nucleus. From a calculation of Fermi and Uhlenbeck,
the probability of process (b) proves to be considerably
smaller than that of process (a). Other theoretical possibilities for
annihilation exist, but we confine ourselves here to considering only
those two, of which the first is confirmed by notable experiments of
various investigators.** Among them we cite only those of Thibaud
and of Joliot.

* For a more complete bibliography and for greater details concerning this
question and the annihilation of electrons, see Nos. 26 and 31 of the
Bibliography, and the monograph by I. Curie and F. Joliot entitled
L'électron positif. Paris: Hermann, 1934.

These workers have observed independently that if positrons 
are incident upon lead, aluminum, or platinum, a radiation emerges 
from the bombarded metal whose absorption coefficient in lead 
corresponds to gamma rays of about half a million volts (more 
precisely, the limits assigned by Joliot are from 425,000 to 645,000 
volts). Furthermore, a (necessarily inexact) measurement of the 
intensity of this radiation compared to the number of incident 
positrons has permitted Joliot to conclude that from 1.6 to 3 photons 
are emitted for every incident positron, with a good probability for 
the value 2 predicted from theory. 
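The photon energies expected for the two modes of annihilation can be set beside the limits assigned by Joliot. This is a sketch only: energies are in thousands of volts, the rest energy is a rounded modern value inserted as an assumption, the kinetic energies are neglected as in the text, and the function names are hypothetical.

```python
M0C2_KEV = 510.99895              # electron rest energy in thousands of volts (assumed)
JOLIOT_LIMITS_KEV = (425.0, 645.0)  # limits quoted in the text

def annihilation_photon_energy_kev(photon_count):
    """Energy of each photon when the pair at rest emits 1 or 2 photons."""
    if photon_count not in (1, 2):
        raise ValueError("only the one- and two-photon processes are considered")
    return 2.0 * M0C2_KEV / photon_count

def within_joliot_limits(energy_kev):
    low, high = JOLIOT_LIMITS_KEV
    return low <= energy_kev <= high

if __name__ == "__main__":
    for count in (2, 1):
        energy = annihilation_photon_energy_kev(count)
        print(f"{count} photon(s): {energy:.0f} thousand volts each, "
              f"within observed limits: {within_joliot_limits(energy)}")
```

Only the two-photon value, about 511 thousand volts, falls within the observed limits, while the single photon of process (b) would have twice that energy.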

** Chao, Proc. Nat. Acad. 16, 431 (1930); Phys. Rev. 36, 1519 (1930); Proc.
Roy. Soc. A136, 206 (1931); Tarrant, ibid. A136, 662 (1932); Gray and Tarrant,
ibid. 143, 681 (1934); J. Thibaud, Comptes Rendus 197, 1629 (1933); F. Joliot,
ibid., 197, 1622 (1933), and J. de Phys. V, 299 (1934).

The phenomenon of annihilation provides a reason for the fact
that positive electrons are so much more rarely observed than
negative ones, and that they do not intervene in the conduction of
electricity, in the thermionic effect, and in similar phenomena. In
fact, a positron in matter (even in a rarefied gas) finds itself in the
presence of a large number of negative electrons and has a strong
tendency to combine with one of these. The probability for
combination has been calculated by Dirac, who found that it increases,
tending toward a maximum value upon a decrease in velocity of the
positron, so that in practice the positron may survive only if it
possesses a considerable velocity. The mean life of a slow positron
in water, for example, proves to be 3.5 X sec. 



CHAPTER 15 


Systems with Identical Particles 

61. Identity of elementary particles. The intuitive concept of 
particles such as electrons, protons, and photons as material points 
is incorrect, not only because it attributes to them kinematic 
properties which do not correspond to reality (as was amply
illustrated in Part I) but also because it assigns them an “individuality”
which they do not have in nature. In order to understand this
assertion better, let us think of a system containing several electrons 
(for example, an atom). If we let ourselves be guided by the 
corpuscular model and if we think of the electrons as tiny balls, we 
can always attach a meaning to the operation of interchanging two 
of these balls, even if they are identical (in the ordinary sense of the 
term). But if we analyze the meaning of this exchange operationally,
we see that it is based on the possibility of attributing an
individuality to the particle by “marking” it in some way (even
so slightly as to leave its fundamental properties unchanged), or by 
following it uninterruptedly throughout the operation of exchange. 
But for electrons (and for any other elementary particle) it is 
conceptually impossible either to mark them or to follow them with 
continuity. For this reason too, then, the corpuscular model is 
inadequate for their representation. The property of “individuality”
in this case does not correspond to any physical reality.
The only reason why we are accustomed to attribute this property 
to elementary particles is that in our mind and our everyday 
language, it is indissolubly associated (just like the concepts of 
position, velocity, trajectory, and so on) with the word “particle,”
by which we usually describe electrons, protons, and the like. Now 
we have seen that as far as the kinematic concepts are concerned, 
the inadequacy of corpuscular terminology may be compensated 
for by the introduction of the uncertainty principle. Similarly, as 
far as the question of individuality is concerned, it is still possible to 
retain corpuscular terminology, but with the express convention 
that two elementary particles of the same kind are to be considered
identical in a stricter sense of the word than is ordinarily implied.
In this sense the expression “interchange of any two particles”
loses its meaning, so that the characteristics of individuality are
removed from the particle concept.¹ It is in this special sense that
the identity of particles, with which we shall be dealing in this 
chapter, is to be understood. As we shall see, this property of 
elementary particles, which has no analogue among ordinary bodies, 
leads to singular consequences, which cannot be interpreted
intuitively by means of the corpuscular model but which in many cases
are susceptible of experimental verification.

First let us consider systems with only two identical particles 
(such as, for instance, the helium atom). We shall then briefly 
extend the arguments to systems with any number of identical 
particles. 

62. States of a system with two identical particles. Symmetry 
and antisymmetry. Let us consider a system composed of two
identical particles (located in a given field), and let us indicate for
short by $q^1$, $p^1$ the aggregate of coordinates and momenta of one
of them, and by $q^2$, $p^2$ the corresponding quantities of the other.
We also include the spin variable in the $q$, which we shall denote by
$\sigma$ and which might, for example, be the $\sigma_z$ of the
preceding chapter (hence, for example, $q^1$ stands for the group
$x_1, y_1, z_1, \sigma_1$). Any observable of the system must have an
expression $F(q^1, p^1, q^2, p^2)$ which is symmetric with respect to
the two groups of variables; that is, the expression must remain the
same when the $q^1$, $p^1$ are changed into the corresponding $q^2$,
$p^2$. Otherwise, the interchange of the two particles would have
physical significance. We shall indicate the symmetry of the function
$F$ by writing, for short,

$$F(1, 2) = F(2, 1),$$

1 In order to clarify this statement by another example of entities devoid of
individuality in the sense specified above, let us think of a vibrating string in
which two systems of standing waves are simultaneously excited: one with
frequency $\nu_1$, amplitude $A_1$, phase $\varphi_1$; the other with frequency
$\nu_2$, amplitude $A_2$, phase $\varphi_2$. The phrase “interchange of the
system of waves ($\nu_1, A_1, \varphi_1$) with the system ($\nu_2, A_2,
\varphi_2$)” is completely devoid of meaning. In the same way, if
there are two electrons in an atom, one in a quantum state defined by certain
quantum numbers $n_1, l_1, m_1$, the other in a state defined by $n_2, l_2,
m_2$, there is no meaning whatsoever in saying that the first electron is
placed into the second state and the second electron into the first.
instead of

$$F(q^1, p^1, q^2, p^2) = F(q^2, p^2, q^1, p^1).$$

Hence the operator corresponding to a general observable $F$ will
also be symmetric. In particular, the expression for the energy
will be symmetric, and hence also the Hamiltonian operator
$\mathcal{H}$. It follows that if $\psi_n(1, 2)$ is an eigenfunction
belonging to the eigenvalue $E_n$, that is, if it satisfies the equation

$$\mathcal{H}\,\psi_n(1, 2) = E_n\,\psi_n(1, 2), \qquad (360)$$

then $\psi_n(2, 1)$ is also an eigenfunction belonging to the same
eigenvalue, since the equation is still satisfied if the $q^1$ and the
$q^2$ are interchanged in it.

Some important consequences are derived from this reasoning.
First let us suppose that $E_n$ is a simple eigenvalue. In that case
$\psi_n(2, 1)$ cannot be essentially different from $\psi_n(1, 2)$;
that is, we must have

$$\psi_n(2, 1) = c\,\psi_n(1, 2),$$

where $c$ is a constant. If we then interchange $q^1$ and $q^2$ in this
relation and multiply one equation by the other, we find $c^2 = 1$,
or $c = \pm 1$. Hence an eigenfunction belonging to a simple
eigenvalue either has the property

$$\psi_n(2, 1) = \psi_n(1, 2);$$

that is, it is symmetric, or else

$$\psi_n(2, 1) = -\psi_n(1, 2),$$

in which case it is said to be antisymmetric.

Let us now examine the case of degeneracy, in which we suppose
that $E_n$ is a multiple eigenvalue of order $p$, and let

$$\psi_n^{(1)}(1, 2),\ \psi_n^{(2)}(1, 2),\ \ldots,\ \psi_n^{(p)}(1, 2) \qquad (361)$$

be a fundamental set of mutually orthogonal eigenfunctions
belonging to $E_n$ (see §6 of Part II). It is known that this set may be
replaced by any other system obtained from the previous one by
an orthogonal transformation. We shall now show that we may
choose the transformation in such a way that the system will be
made up exclusively of symmetric and antisymmetric functions
(not merely mutually orthogonal).
Let us consider, for instance, the first of the eigenfunctions (361);
upon interchanging the $q^1$ and $q^2$ we obtain a new eigenfunction
$\psi_n^{(1)}(2, 1)$ belonging to the same eigenvalue. The latter may
coincide (to within a constant factor) with the same
$\psi_n^{(1)}(1, 2)$; and then it may be shown, as we have just seen,
that it is already symmetric or antisymmetric. Or it may coincide with
another one of (361), for example $\psi_n^{(r)}(1, 2)$, and in that case
we shall replace the pair of functions $\psi_n^{(1)}(1, 2)$,
$\psi_n^{(r)}(1, 2)$ by the two (evidently nonzero) functions

$$\tfrac{1}{2}\,[\psi_n^{(1)}(1, 2) \pm \psi_n^{(1)}(2, 1)], \qquad (362)$$

which are also linearly independent of each other and of the others;
furthermore, one of them is symmetric and the other is
antisymmetric. Finally, $\psi_n^{(1)}(2, 1)$ may not coincide with any of
the functions (361), and then we may substitute for
$\psi_n^{(1)}(1, 2)$ any one of the two functions (362), which are both
independent of the others.
Applying this procedure to all the eigenfunctions (361) in
succession, we shall be able to replace them by an equal number of
independent eigenfunctions, some of which are symmetric and
others antisymmetric. Their orthogonality remains to be
established. Let us consider the group of symmetric eigenfunctions.
It is always possible to perform a linear transformation upon them
such as to replace them by an equal number of mutually orthogonal
independent eigenfunctions (see, for example, §6 of Part II) which
will evidently also be symmetric. Similarly, the antisymmetric
eigenfunctions may be replaced by an equal number of their linear
combinations which are independent and mutually orthogonal (and
evidently antisymmetric). Finally, we note that any symmetric
function $\psi_s$ and any antisymmetric function $\psi_a$ are
necessarily orthogonal, since the integral

$$\int \psi_s^*(1, 2)\,\psi_a(1, 2)\,dq^1\,dq^2$$

must not change when the variables $q^1$ and $q^2$ are interchanged.
On the other hand, this interchange only changes the sign of the
integrand; hence the integral vanishes. Therefore we recognize
that the $p$ eigenfunctions, which are partly symmetric and partly
antisymmetric, and which we have substituted for the functions
(361), are all mutually orthogonal.
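
The construction above may be tried out on a small numerical stand-in. The following is only a sketch under stated assumptions: the two-particle function is replaced by a hypothetical 3 x 3 table `psi[i][j]` (the values are arbitrary), the exchange of the particles becomes a transposition of the table, and the integral becomes a double sum. The helper names `exchange`, `combine`, and `inner` are ours, introduced for illustration.

```python
# Discrete stand-in for a two-particle function: psi[i][j] ~ psi(q1=i, q2=j).
# Exchanging the two particles corresponds to transposing the table.

def exchange(psi):
    """Return psi(2, 1): swap the roles of the two arguments."""
    n = len(psi)
    return [[psi[j][i] for j in range(n)] for i in range(n)]

def combine(psi, sign):
    """Return (1/2) * [psi(1, 2) + sign * psi(2, 1)]."""
    swapped = exchange(psi)
    n = len(psi)
    return [[0.5 * (psi[i][j] + sign * swapped[i][j]) for j in range(n)]
            for i in range(n)]

def inner(f, g):
    """Discrete analogue of the integral of f* g over both particles."""
    n = len(f)
    return sum(complex(f[i][j]).conjugate() * g[i][j]
               for i in range(n) for j in range(n))

# An arbitrary table with no particular symmetry (illustrative values only).
psi = [[1.0, 2.0, -1.0],
       [0.5, 3.0, 4.0],
       [2.5, -2.0, 1.5]]

psi_s = combine(psi, +1)   # symmetric combination
psi_a = combine(psi, -1)   # antisymmetric combination

overlap = inner(psi_s, psi_a)
print(abs(overlap))  # vanishes up to rounding: the two parts are orthogonal
```

The overlap vanishes for the same reason as in the text: each term of the double sum is cancelled by the term with the two indices exchanged.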

If for any possible multiple eigenvalue we select the fundamental
set of eigenfunctions in the manner described, we obtain a complete
set of orthogonal eigenfunctions, which contains only eigenfunctions
which are either symmetric or antisymmetric.

Let us now proceed to consider in general the possible (also
nonstationary) states of the system. Let us assume, as is physically
plausible, not only that the probability density $P = |\psi|^2$ must be
symmetric with respect to an exchange of the two particles, but
also that if $\psi(1, 2)$ represents a possible state of the system, then
$\psi(2, 1)$ will also represent a possible state. It may then be shown
that $\psi(1, 2)$ must necessarily be either symmetric or antisymmetric.
A $\psi$ which does not have this property (such a $\psi$ might be
obtained by a linear combination of symmetric and antisymmetric
eigenfunctions) does not represent a physically possible state of the
system, though it may satisfy the time-dependent Schrödinger
(or Dirac) equation.

Let $\psi(1, 2)$ be neither symmetric nor antisymmetric. If it
represented a possible state, so would $\psi(2, 1)$, and hence their
(nonzero) symmetric and antisymmetric combinations:
$\psi_s = \tfrac{1}{2}[\psi(1, 2) + \psi(2, 1)]$,
$\psi_a = \tfrac{1}{2}[\psi(1, 2) - \psi(2, 1)]$. But then any
other combination of the type $\psi_s + e^{i\delta}\psi_a$ (where
$\delta$ is an arbitrary constant) would also represent a possible
state. This condition is in contradiction with our first hypothesis,
since the probability density of that state,

$$P = \psi_s\psi_s^* + e^{-i\delta}\psi_a^*\psi_s + e^{i\delta}\psi_a\psi_s^* + \psi_a\psi_a^*,$$

is not symmetric in general. In fact, if the variables $q^1$ and $q^2$ are
exchanged, $P$ is changed into

$$P' = \psi_s\psi_s^* - e^{-i\delta}\psi_a^*\psi_s - e^{i\delta}\psi_a\psi_s^* + \psi_a\psi_a^*.$$

The difference $P - P' = 2(e^{i\delta}\psi_a\psi_s^* +
e^{-i\delta}\psi_a^*\psi_s)$ does not vanish in general, in view of the
arbitrary nature of $\delta$. Hence if the state $\psi(1, 2)$ were
possible, states would also be possible whose $P$ is not
symmetric; this condition is to be excluded.
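
The dependence of the density on the arbitrary phase can be checked numerically. The sketch below is an illustration only: the two 2 x 2 tables are hypothetical stand-ins for a symmetric and an antisymmetric function, and `density` is a helper name of our own choosing.

```python
import cmath

def density(psi_s, psi_a, delta, i, j):
    """|psi_s + e^{i delta} psi_a|^2 at the grid point (q1=i, q2=j)."""
    value = psi_s[i][j] + cmath.exp(1j * delta) * psi_a[i][j]
    return abs(value) ** 2

# Illustrative tables: psi_s is symmetric, psi_a is antisymmetric.
psi_s = [[1.0, 0.5], [0.5, 2.0]]
psi_a = [[0.0, 1.0], [-1.0, 0.0]]

for delta in (0.0, cmath.pi / 2):
    p = density(psi_s, psi_a, delta, 0, 1)          # density at (1, 2)
    p_swapped = density(psi_s, psi_a, delta, 1, 0)  # arguments exchanged
    print(delta, p - p_swapped)
```

For these real tables the difference is proportional to the cosine of the phase, so it is nonzero for the first value of the phase and vanishes only for the special second value, in agreement with the statement that it does not vanish in general.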

Hence we have shown that the a priori possible states for the
system fall into two classes which we shall call symmetric states and
antisymmetric states, according to whether $\psi$ is symmetric or
antisymmetric (implying: with respect to an exchange of
$x_1, y_1, z_1, \sigma_1$ and $x_2, y_2, z_2, \sigma_2$).

We shall now show that the system can never pass from a
symmetric state to an antisymmetric state, or vice versa; and that
consequently, in the evolution of a given system, states of only one of
these two classes may occur (as we shall see later, for electrons and
protons the antisymmetric states are the only ones which arise).
In fact, from the fundamental equation

$$\mathcal{H}\psi = -\frac{h}{2\pi i}\,\frac{\partial \psi}{\partial t}, \qquad (363)$$

we find that the increment of $\psi$ in the time $dt$ is

$$d\psi = -\frac{2\pi i}{h}\,\mathcal{H}\psi\,dt. \qquad (364)$$

Now since $\mathcal{H}$ is a symmetric operator, if $\psi$ is symmetric
(or antisymmetric) at a given instant $t$, the same will be true for
$d\psi$, and hence also for $\psi + d\psi$. Therefore $\psi$ at the time
$t + dt$ has the same symmetry character which $\psi$ had at time $t$.
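
The step-by-step character of this argument lends itself to a numerical illustration. The sketch below rests on assumptions of our own: each particle lives on three hypothetical sites, the symmetric Hamiltonian is built from one single-particle matrix `h` applied to each particle plus a symmetric interaction table `v`, units are chosen so that h/2π = 1, and the increment is taken as −i H ψ dt (a plain Euler step, which is crude but sufficient to exhibit the symmetry).

```python
def apply_h(psi, h, v):
    """Apply H = h(1) + h(2) + v(1, 2) to the table psi[i][j]."""
    n = len(psi)
    out = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            one = sum(h[i][k] * psi[k][j] for k in range(n))  # h on particle 1
            two = sum(h[j][k] * psi[i][k] for k in range(n))  # same h on particle 2
            out[i][j] = one + two + v[i][j] * psi[i][j]
    return out

def step(psi, h, v, dt):
    """One increment psi -> psi + d(psi), with d(psi) = -i H psi dt."""
    hpsi = apply_h(psi, h, v)
    n = len(psi)
    return [[psi[i][j] - 1j * dt * hpsi[i][j] for j in range(n)]
            for i in range(n)]

def symmetry_defect(psi, sign):
    """Largest |psi(1, 2) - sign * psi(2, 1)| over the grid."""
    n = len(psi)
    return max(abs(psi[i][j] - sign * psi[j][i])
               for i in range(n) for j in range(n))

# Illustrative numbers only.
h = [[0.0, -1.0, 0.0], [-1.0, 0.5, -1.0], [0.0, -1.0, 1.0]]
v = [[2.0, 0.7, 0.3], [0.7, 2.0, 0.7], [0.3, 0.7, 2.0]]   # v[i][j] == v[j][i]
psi = [[0j, 1 + 0j, 0.5j],
       [-1 + 0j, 0j, 2 + 0j],
       [-0.5j, -2 + 0j, 0j]]                               # antisymmetric start

for _ in range(200):
    psi = step(psi, h, v, 0.01)

print(symmetry_defect(psi, -1))  # stays at rounding level: still antisymmetric
```

Each increment is itself antisymmetric because the operator treats the two indices alike, so the defect never grows, however many steps are taken.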

63. Note on the extension to N identical particles. If the
system consists of $N$ identical particles, and if we denote by $q^i$,
$p^i$ the coordinates and momenta of the $i$th particle (including the
spin variables), it is obvious that any quantity having physical
significance, and in particular the probability density $|\psi|^2$, must
be symmetric with respect to all the particles; that is, it must remain
unchanged when, in any manner whatsoever, the indices
characterizing the variables are interchanged. In particular, from the
fact that the Hamiltonian $\mathcal{H}$ possesses this property, we
deduce, by the same procedure which was used in the case of two
particles, that if in an eigenfunction $\psi_n(q^1, q^2, \ldots, q^N)$ we
permute the indices of the variables in any manner, then we obtain an
eigenfunction belonging to the same eigenvalue. If the eigenvalue is
simple, this eigenfunction must coincide with the original one to
within a constant factor, which (as may be shown as in the previous
case) can be only $\pm 1$. We shall prove that this factor is either +1
for all permutations (symmetric function), or +1 for the even
permutations² and −1 for the odd permutations (antisymmetric function).

First let us consider a transposition $(r, s)$, that is, the exchange of
only two indices, $r$ and $s$, and let us indicate by $C_{rs}$ the factor
($= \pm 1$)

2 A permutation is even or odd according to whether it may be obtained by
an even or odd number of transpositions, that is, of exchanges of only two
elements at a time.
by which the function is multiplied as a consequence of this
exchange. We then note that the transposition $(r, s)$ may also be
obtained by the successive application of the transpositions $(r, t)$,
$(t, s)$, $(r, t)$, where $t$ is any other index. Therefore between the
corresponding factors $C$ we shall have the relation
$C_{rs} = C_{rt}C_{ts}C_{rt} = C_{ts}$. Now, if $C_{rs} = C_{ts}$ no matter
what the indices $r$, $s$, $t$,
then all the $C$ are equal to one another. Hence they are either all
equal to +1, in which case any transposition, and therefore also
any permutation, leaves the function unaltered; or else they are
all equal to −1, and then an even number of transpositions will
leave the function unaltered, and an odd number will change its
sign; that is, we are dealing with an antisymmetric function. Note
that in this argument we have not made use of the equation which
$\psi_n$ satisfies; and hence it holds not only for an eigenfunction but
also for any function of the coordinates which, upon any
transposition of the indices of the latter, is multiplied by $\pm 1$.
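
The two combinatorial facts used in this argument can be checked by brute force. The following sketch represents a permutation of the indices 0, 1, ..., n − 1 as a tuple; the function names and the choice n = 5 are ours, made only for illustration.

```python
from itertools import permutations

def transposition(n, r, s):
    """Permutation of range(n), as a tuple, that exchanges r and s."""
    p = list(range(n))
    p[r], p[s] = p[s], p[r]
    return tuple(p)

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))

def parity(p):
    """+1 for an even permutation, -1 for an odd one (counting inversions)."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

n = 5
# (r, s) equals (r, t)(t, s)(r, t) for every choice of three distinct indices.
identity_holds = all(
    compose(transposition(n, r, t),
            compose(transposition(n, t, s), transposition(n, r, t)))
    == transposition(n, r, s)
    for r, s, t in permutations(range(n), 3)
)
print(identity_holds)

# A single transposition is odd; two transpositions in succession are even.
single = transposition(n, 0, 3)
double = compose(transposition(n, 0, 1), transposition(n, 1, 2))
print(parity(single), parity(double))
```

If every transposition multiplies the function by −1, a permutation obtained from k transpositions multiplies it by (−1)^k, which is exactly the value returned by `parity`.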

If the eigenvalue is multiple, a permutation of the indices in $\psi_n$
will change it into a different eigenfunction, in general independent
of the former one. However, it may be shown that for any multiple
eigenvalue of order $p$ there exists a set of $p$ (independent and
orthogonal) eigenfunctions of which one is symmetric, one
antisymmetric, and the others more complicated in behavior.
However, these do not interest us here because, as we shall see, they
do not represent physically possible states.³

Let us now consider some general state, even nonstationary,
represented by a certain $\psi$. We assume (as in the case of two
particles) that when the coordinates of any two particles are
interchanged, the $\psi$ is changed into another function representing a
physically possible state. Hence we may apply the reasoning of the
preceding section to each of these exchanges or transpositions, and we
reach the conclusion that $\psi$ must be either symmetric or
antisymmetric with respect to any transposition. Therefore (since we
have just seen that a function may not be symmetric for some
transpositions and antisymmetric for others) $\psi$ must be either
symmetric or antisymmetric (with respect to any permutation of the
indices). Correspondingly, we shall speak of symmetric states and
antisymmetric states;

3 The general study of the symmetry properties of the eigenfunctions of N
particles was made with the methods of group theory by E. Wigner, Zeits. f.
Physik 40, 883 (1927).
states which do not fall into one of these two categories are not
possible.

The arguments presented at the end of §62 may be extended
immediately to the case of several particles, in order to show that
the system cannot pass in any way from a symmetric to an
antisymmetric state, or vice versa.

64. The Pauli principle and the antisymmetry principle. In the
Bohr-Sommerfeld theory of the atom, as we know, three quantum
numbers were assigned to each electron (the principal quantum
number $n_i$, the azimuthal quantum number $l_i$, and the magnetic
quantum number $m_i$) which completely define its orbit. If we add to
these, in accordance with the hypothesis of the spinning electron,
the spin quantum number $s_i$ (capable of having only two values,
corresponding to the two opposite orientations which the spin may
take), the quantum state of each of the electrons in the atom will be
specified by a quadruplet of quantum numbers. These numbers are
involved in all laws which determine the behavior of the electron 
in question (for example, in the selection rules) and are therefore 
closely related to the chemical and spectroscopic phenomena of the 
outermost electrons (valence and emission electrons) and to the 
emission and absorption of X rays for the inner electrons. Hence 
it is possible to check their values experimentally. 

Actually, in §56 of Part II, we have defined the quantum numbers
only for a single electron in a central field. In an atom with
several electrons, their interaction complicates the problem. In 
first approximation we can, as was seen in §59 of Part II, idealize 
the influence exerted upon the ith electron by the remaining ones 
by considering that a central force is added to the force due to 
the nucleus, whereby we are led back to the previous situation. 
But even independently of any approximation, it is always possible 
to define a group of four quantum numbers $n_i, l_i, m_i, s_i$ for each
electron in the following manner. Let us suppose that the atom 
is placed in a magnetic field whose value, starting from zero, 
increases slowly until the forces which it exerts upon the electrons 
become predominant over all their mutual interactions, and over 
the action of the orbital magnetic field upon the spins. It may 
be shown that if the atom was originally in a given quantum state, 
it will remain there even in a strong magnetic field. But in that 
field each electron may be quantized separately, and hence its four 
quantum numbers may be assigned in the usual way. Since the
system was brought continuously from a zero magnetic field to a 
very strong field, whereas the quantum numbers cannot vary with 
continuity, we must attribute to each electron in the atom not 
subject to the field, the same quantum numbers which belong to it 
in a sufficiently intense magnetic field. 

Now it was discovered empirically (for the first time by Pauli
while he was studying the spectra of the alkali metals)⁴ that in an
atom there never are two electrons having the same group of four
quantum numbers.⁵ This fact cannot be deduced from any other law of
quantum theory; but since it is consistently borne out by all atoms
(and also by more complex systems, such as molecules), it was
assumed as one of the fundamental principles of atomic physics
under the name of Pauli principle or exclusion principle, and has
proved extremely fertile in consequences in full accord with
experience.⁶ It may be said that the entire theoretical interpretation of
spectra is based upon this principle and constitutes an impressive
confirmation thereof. In the present volume, we shall have
occasion to cite only one of these applications (see §67).

The Pauli principle has furnished the key for the interpretation
of one of the most fundamental laws of nature, namely, of the
periodic system of Mendeleev. For a detailed presentation of
this interpretation we refer to a work on spectroscopy (see footnote,
page 252). We confine ourselves here to a brief schematic and
simplified outline. If the Pauli principle did not hold, each electron
would tend to be located in the orbit of least energy, which is
(at least for the lighter elements, to which we restrict ourselves
here) the orbit with $n = 1$, $l = 0$, $m = 0$. Therefore all the
electrons of the atom in the ground state would have these quantum

4 Zeits. f. Physik 31, 765 (1925).

5 The fourth quantum number $s$ can take on only two values. Thus when
the values of the three quantum numbers $n$, $l$, $m$ (called “orbital” quantum
numbers) are fixed, there may be at most two electrons having these three orbital
quantum numbers (or, as is said for short, “having this orbit”). This is
the form in which Pauli stated his principle for the first time. The two
electrons having the same orbit must then differ in their quantum number $s$,
that is, they must have their spins in opposite directions.

6 The Pauli principle is assumed for any system containing several electrons;
for example, “Fermi statistics,” valid for an electron gas, and in particular for
the conduction electrons in metals, is based upon this principle. There is
reason to believe that the Pauli principle holds also for protons and some nuclei.
For others, however (for instance, for α-particles) it is not valid.
numbers (or, in spectroscopic terminology, they would be in the
1s-state), and the optical and chemical properties would vary
progressively from one atom to the next, corresponding to the
increase of the number of these electrons. In particular, the
fundamental term of the spectrum would always be 1s. All this is in
absolute contrast with reality. Instead, according to the Pauli
principle, in the afore-mentioned orbit there cannot be more than
two electrons. Therefore in the lithium atom (atomic number
$Z = 3$, hence three electrons), two electrons find themselves in the
first orbit, but the third must be located in the next orbit (in order
of increasing energy), which is $n = 2$, $l = 0$, $m = 0$. This electron
is therefore less tightly bound to the nucleus than the others, and
hence acts as valence electron (hence Li is monovalent) and as emission
electron (hence the fundamental term is 2s). In Be ($Z = 4$) a
second electron occupies the second orbit (hence valence of two,
and fundamental term 2s). In B ($Z = 5$), the first two orbits
being completely occupied, the fifth electron will go into an orbit
$n = 2$, $l = 1$, which explains why the ground term of boron is 2p.
Thus, by adding one electron at a time to the lowest energy state
which remains free, compatible with the Pauli principle (and of
course, increasing the charge of the nucleus by one unit every time),
we determine the quantum state of the various electrons in
successive elements. In particular, the quantum state of the
electron added last defines the ground (fundamental) term of the
spectrum. The predictions obtained in this way are in remarkable
accord with the experimentally observed properties of the different
elements. The existence of the so-called “periods” is explained
by the fact that the successively added electrons occupy (as a rule)
first the orbits with $l = 0$, then with $l = 1$, then with $l = 2$, . . . ,
up to $n - 1$, which is the maximum $l$ compatible with a given $n$.
After this we pass to the orbits with $n + 1$, starting once again
with $l = 0$, $l = 1$, and so on. And since it follows from a simple
calculation that at a fixed $n$ there are $n^2$ orbits, that is, $2n^2$
places, we shall have a first period of $2 \cdot 1^2 = 2$ elements (H and
He), a second one with $2 \cdot 2^2 = 8$, and so forth.
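
The simple calculation mentioned here amounts to counting the admissible values of the orbital quantum numbers at fixed n. A minimal sketch follows; the function names are ours, and the only rules assumed are those stated in the text (l runs from 0 to n − 1, m from −l to +l, and each orbit holds two electrons of opposite spin).

```python
def orbits(n):
    """All (l, m) pairs allowed for the principal quantum number n."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

def places(n):
    """Number of electrons admitted at a given n: two spin values per orbit."""
    return 2 * len(orbits(n))

for n in (1, 2, 3):
    print(n, len(orbits(n)), places(n))
```

For n = 1 and n = 2 this reproduces the 2 and 8 places quoted above; the count of orbits is the sum of the odd numbers 1, 3, ..., 2n − 1, which is n squared.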

For further particulars, see, for example, No. 27 of the
Bibliography.

In the statement of the Pauli principle, we have thus far used
the terminology of the Bohr-Sommerfeld theory. But in wave
mechanics, this principle has received an even simpler and more
general formulation by the work of Dirac, since it also applies to
nonstationary states. This formulation states: in a system
containing several electrons, only the antisymmetric states are possible
(in the sense explained at the end of §63). But wave mechanics
does not provide any proof of this principle either (we shall call it
the antisymmetry principle), and hence it must be considered a
postulate. However, wave mechanics shows (as we have seen) that if
a system obeys this principle at a certain instant, it will satisfy it
forever.

We shall now show that the Pauli exclusion principle follows as
a necessary consequence from Dirac's antisymmetry principle. Let
us suppose that the system is in a stationary state, and let us place
it in a magnetic field whose intensity, which is very weak at first,
we shall increase up to a value such that we may neglect the
interactions between the various electrons, as we have pointed out
earlier in this section. Let $\psi_{n_i}(q^i)$ be the eigenfunction of the
$i$th electron ($n_i$ stands for the group of four numbers $n_i, l_i, m_i,
s_i$, and $q^i$ stands for $x_i, y_i, z_i, \sigma_i$). Since the different
electrons are dynamically independent, the equation of the total
system has for eigenfunction the product of the eigenfunctions
corresponding to the separate electrons which compose the system
(see §20), that is,
$\psi_{n_1}(q^1)\,\psi_{n_2}(q^2) \cdots \psi_{n_N}(q^N)$. But since the
equation is symmetric with respect to the $N$ groups of four
coordinates, it will also be satisfied by any function (corresponding to
the same eigenvalue) which is obtained from the previous one by any
permutation of the $q^i$, namely by

$$\psi_{n_1}(q^{r_1})\,\psi_{n_2}(q^{r_2}) \cdots \psi_{n_N}(q^{r_N}), \qquad (365)$$

where $r_1, r_2, \ldots, r_N$ represents any permutation of the numbers
$1, 2, \ldots, N$. The general solution will be a linear combination
of all the solutions obtained in this manner. Of these combinations,
one will be symmetric, which is the sum of all the $N!$ products of
the type (365):

$$\psi_s = \sum \psi_{n_1}(q^{r_1})\,\psi_{n_2}(q^{r_2}) \cdots \psi_{n_N}(q^{r_N}) \qquad (366)$$

(the summation being extended over all permutations $r_1, r_2, \ldots,
r_N$); and one combination will be antisymmetric and may be written in
the form of a determinant as follows:

$$\psi_a = \begin{vmatrix}
\psi_{n_1}(q^1) & \psi_{n_1}(q^2) & \cdots & \psi_{n_1}(q^N) \\
\psi_{n_2}(q^1) & \psi_{n_2}(q^2) & \cdots & \psi_{n_2}(q^N) \\
\vdots & \vdots & & \vdots \\
\psi_{n_N}(q^1) & \psi_{n_N}(q^2) & \cdots & \psi_{n_N}(q^N)
\end{vmatrix} \qquad (367)$$

[In fact, the development of this determinant is just the sum of all
expressions of the type (365), preceded by a plus or minus sign,
according to whether the permutation $r_1, r_2, \ldots, r_N$ is even or
odd. The expression is then evidently antisymmetric, since the
exchange of any two $q$, which is equivalent to an exchange of two
columns of the determinant, only changes the sign of the latter.]
If we now assume the antisymmetry principle, we must adopt $\psi_a$ as
the only eigenfunction possible among all those mentioned above.
Under this assumption, if there are two or more electrons in the
system with the same quantum numbers, there would be two or
more equal rows in the determinant, and hence $\psi_a$ would be
identically zero. Hence the antisymmetry principle leads to the
exclusion of the possibility of several electrons in the same quantum
state, and contains the Pauli principle.
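
Both properties of the determinant can be verified on a small example. The sketch below is an illustration under our own assumptions: three hypothetical one-particle functions of a single real coordinate stand in for the eigenfunctions of the separate electrons (spin is left out), and the determinant is evaluated directly as the signed sum over permutations described in the bracketed remark.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inv = sum(1 for i in range(len(perm))
              for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def antisymmetric(orbitals, coords):
    """Signed sum over permutations r of the products psi_{n_k}(q^{r_k})."""
    total = 0.0
    for perm in permutations(range(len(coords))):
        term = 1.0
        for k, r in enumerate(perm):
            term *= orbitals[k](coords[r])
        total += sign(perm) * term
    return total

# Hypothetical one-particle functions, chosen only for illustration.
f0 = lambda q: 1.0
f1 = lambda q: q
f2 = lambda q: q * q

q = [0.5, 2.0, 3.0]
value = antisymmetric([f0, f1, f2], q)
swapped = antisymmetric([f0, f1, f2], [q[1], q[0], q[2]])  # exchange two particles
repeated = antisymmetric([f0, f1, f1], q)                  # two equal quantum states
print(value, swapped, repeated)
```

Exchanging two coordinates reverses the sign of the result, and assigning the same function to two electrons makes it vanish, whatever the coordinates.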

Conversely, the antisymmetry principle is a necessary consequence
of the Pauli principle, provided the latter is understood in a
somewhat more restricted sense than that presented previously. 
This follows if we assume, not only that two electrons cannot exist 
in the same quantum state of the system, but also that it is not 
possible to bring the system into such a state, no matter what the 
forces to which it is subjected (provided, of course, that they act 
upon all electrons in a symmetric manner). In fact, if $\psi$ were
symmetric instead of antisymmetric (we have already seen in §63
that no other cases can present themselves) it would be possible, 
by an appropriate (symmetric) Hamiltonian, to make it evolve in 
such a way as to coincide, after a certain time, with the symmetric 
eigenfunction given by (366), in which the indices $n_1, n_2, \ldots, n_N$
have been fixed in any manner whatever, even with any number of 
them equal to each other. 

Therefore not only does the antisymmetry principle contain the 
Pauli principle; it also constitutes the necessary condition to ensure 
its permanent validity in a system, no matter in what physical 
circumstances the system finds itself. In that sense, the
antisymmetry principle may be considered equivalent to the Pauli
principle and is actually often called by that name. 

65. Approximate calculation of the states of a system with two
identical particles. In a system consisting of two particles, each of
the particles may be considered to be acted upon by: (a) forces
independent of the presence of the second particle; (b) (electric
and magnetic) forces due to the other particle. In many cases,
these latter forces may be considered small with respect to the
former, so that we may neglect them in a preliminary approximation
(zero-order approximation); we introduce them later into a first
approximation by the perturbation method and, if need be, into
higher approximations. As we shall see, we may treat the helium
atom and the hydrogen molecule by this method, these two cases
being the most interesting applications of this theory.⁷

First let us deal with the zero-order approximation; that is, let us
neglect the interaction between the two particles. We can then
consider each of them as belonging to a different system (ignoring
the spatial superposition of the two systems); and, in a stationary
state, we may attribute to the first particle an eigenfunction
$\psi_{n_1}(1)$, to the second an eigenfunction $\psi_{n_2}(2)$ (where, as
usual, $n_1$, $n_2$ represent two groups of four quantum numbers, and
1, 2 represent two groups of four coordinates). These eigenfunctions
satisfy the two equations

$$\mathcal{H}_1\,\psi_{n_1}(1) = E_{n_1}\,\psi_{n_1}(1), \qquad \mathcal{H}_2\,\psi_{n_2}(2) = E_{n_2}\,\psi_{n_2}(2), \qquad (368)$$

where we have denoted by $\mathcal{H}_1$ the Hamiltonian relative to the
first particle, and by $\mathcal{H}_2$ that of the second particle
(neglecting their interaction). These operators, of course, have the
same form, and the second differs from the first only by the
mathematical substitution of the letters $x_2, y_2, z_2, \sigma_2$ for
$x_1, y_1, z_1, \sigma_1$. Hence $\psi_{n_1}$ and $\psi_{n_2}$ are obtained,
in substance, by solving one and the same eigenvalue problem. It
suffices to write the following equation, which

7 In order to obtain greater precision in the zero approximation, we may 
also include part of the interaction force, representing it schematically as a 
central field acting on each particle and equal, approximately, to the average 
value of the force exerted upon it by the other particle (screening action; see §59
of Part II). The perturbation then reduces to the difference between the true 
interaction and this force. This result does not substantially alter anything 
of what follows. 
is the same as (368) but lacks indices,

$$\mathcal{H}\,\psi(x, y, z, \sigma) = E\,\psi(x, y, z, \sigma), \qquad (369)$$

and then to take one of its particular eigenfunctions $\psi_{n_1}$ and to
affix the index 1 to the variables in it. Similarly, we take another
particular eigenfunction $\psi_{n_2}$ (which may possibly coincide with the
former) and write the variables with index 2. $E_{n_1}$, $E_{n_2}$ are the
corresponding (possibly equal) eigenvalues. If we now consider the
two particles as belonging to a single system (but still neglect
mutual interactions), we know from §20 that to the system there
belongs the operator $\mathcal{H}_1 + \mathcal{H}_2$ and that its state,
when the first particle is in the state $n_1$ and the second particle is
in the state $n_2$, is expressed by the product of the two respective
eigenfunctions, that is, by the following

$$\psi_1^0 = \psi_{n_1}(1)\,\psi_{n_2}(2). \qquad (370)$$

The corresponding energy is

$$E^0 = E_{n_1} + E_{n_2}. \qquad (371)$$

Now, because of the fundamental facts established in §62, if we
exchange the variables $q^1$ and $q^2$ in function (370), we again obtain
an eigenfunction of the system, belonging to the same eigenvalue.
Indicating it by $\psi_2^0$, we have

$$\psi_2^0 = \psi_{n_1}(2)\,\psi_{n_2}(1). \qquad (372)$$

Let us now suppose that the group of four indices $n_1$ is not
identical to the group $n_2$, that is, that the two particles are in
different states. Then $\psi_2^0$ will be different from $\psi_1^0$, and hence
the energy level (371) will be double (independently of other possible
degeneracies, which we are ignoring). Therefore: in the zero-order
approximation there occurs a special degeneracy, deriving from the
identity of the two particles, by virtue of which all energy levels turn
out to be double, except for those corresponding to quantum states which
are equal for both particles. This degeneracy is called exchange
degeneracy or resonance degeneracy, for reasons which will be
explained shortly.

Let us now pass to the first approximation, taking into account 
the interaction between the two particles, which we shall introduce 
as a perturbation, following the methods of Chapter 13. The Hamiltonian of the system will then be written in the form

$$\mathcal{H} = \mathcal{H}_1 + \mathcal{H}_2 + \mathcal{S}, \tag{373}$$

where the operator $\mathcal{S}$ corresponds to the interaction energy of the two particles, and hence contains the variables $q_1$ and $q_2$ symmetrically; for example, if the interaction reduces to the electrostatic repulsion, then $\mathcal{S} = e^2/r_{12}$, where

$$r_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

However, if we take the magnetic interactions into account, $\mathcal{S}$ will also contain the spin variables $\sigma_1$ and $\sigma_2$, but always in a symmetric manner. Because of the exchange degeneracy, we must apply the formulas found in §39, and hence we must first construct the “perturbation matrix,” whose elements are

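$$
\begin{aligned}
L_{11} &= \sum_{\sigma_1 \sigma_2} \int \psi_1^{*}\,\mathcal{S}\,\psi_1\,dS, &
L_{12} &= \sum_{\sigma_1 \sigma_2} \int \psi_1^{*}\,\mathcal{S}\,\psi_2\,dS, \\
L_{21} &= \sum_{\sigma_1 \sigma_2} \int \psi_2^{*}\,\mathcal{S}\,\psi_1\,dS, &
L_{22} &= \sum_{\sigma_1 \sigma_2} \int \psi_2^{*}\,\mathcal{S}\,\psi_2\,dS,
\end{aligned}
\tag{374}
$$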


where every integral is sixfold and $dS$ stands for $dx_1\,dy_1\,dz_1\,dx_2\,dy_2\,dz_2$, whereas the summations are double and stand for the integration with respect to the two discontinuous variables $\sigma_1$, $\sigma_2$ (spin variables), each of which takes on only two values. Then we note that it is legitimate to interchange, in each of these expressions, the designation of the variables of integration and of the indices of summation, without changing the result. In fact, if we interchange the variables $q_1$ and $q_2$, $\psi_1$ changes into $\psi_2$ and vice versa; and hence $L_{11}$ changes into $L_{22}$, $L_{12}$ into $L_{21}$, and vice versa. Therefore


$$L_{11} = L_{22}, \qquad L_{12} = L_{21}. \tag{375}$$


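In first approximation, the perturbation $\varepsilon'$ of the value of the energy is found, as we have seen in §39, by solution of the equation

$$
\begin{vmatrix}
L_{11} - \varepsilon' & L_{12} \\
L_{21} & L_{22} - \varepsilon'
\end{vmatrix} = 0, \tag{376}
$$

which, by virtue of (375), yields the following two values for $\varepsilon'$ (the reason for the notation $\varepsilon'_s$, $\varepsilon'_a$ will appear presently):

$$\varepsilon'_s = L_{11} + L_{12}, \qquad \varepsilon'_a = L_{11} - L_{12}. \tag{377}$$

The corresponding energy levels are, in first approximation,

$$E_s = E_{n_1} + E_{n_2} + L_{11} + L_{12}, \tag{378}$$

$$E_a = E_{n_1} + E_{n_2} + L_{11} - L_{12}. \tag{378'}$$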


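Let us now seek the eigenfunctions of the zero-order approximation corresponding to these eigenvalues. They are given by (see §39)

$$\psi_s^0 = c_{11}\,\psi_1 + c_{12}\,\psi_2, \qquad \psi_a^0 = c_{21}\,\psi_1 + c_{22}\,\psi_2, \tag{379}$$

where the coefficients are obtained from (185'), which in the present case are written, taking $k = 1$ (for $k = 2$ one would obtain an equivalent system),

$$c_{11}(L_{11} - \varepsilon'_s) + c_{12}L_{12} = 0,$$

$$c_{21}(L_{11} - \varepsilon'_a) + c_{22}L_{12} = 0,$$

or, if we take (375) and (377) into account and assume that we have $L_{12} \neq 0$,⁸

$$c_{11} = c_{12}, \qquad c_{21} = -c_{22}.$$

Upon normalizing $\psi_s^0$ and $\psi_a^0$, we find that the modulus of these coefficients must be $1/\sqrt{2}$, so that we may write

$$\psi_s^0 = \frac{1}{\sqrt{2}}(\psi_1 + \psi_2), \qquad \psi_a^0 = \frac{1}{\sqrt{2}}(\psi_1 - \psi_2), \tag{380}$$

or

$$\psi_s^0 = \frac{1}{\sqrt{2}}\left[\psi_{n_1}(q_1)\,\psi_{n_2}(q_2) + \psi_{n_1}(q_2)\,\psi_{n_2}(q_1)\right], \tag{381}$$

$$\psi_a^0 = \frac{1}{\sqrt{2}}\left[\psi_{n_1}(q_1)\,\psi_{n_2}(q_2) - \psi_{n_1}(q_2)\,\psi_{n_2}(q_1)\right]. \tag{381'}$$

As can be seen, the first of these eigenfunctions is symmetric; the second, antisymmetric. This result could have been foreseen because of what was said in §62. In the case where the particles are electrons or protons, the Pauli principle excludes the first of these, and only $\psi_a^0$ remains to represent a possible state of the system. Hence (378') alone represents all physically possible energy levels.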

8 The case $L_{12} = 0$ is realized when the two probability clouds represented by the functions $\psi_{n_1}$, $\psi_{n_2}$ have no points in common: in that case a separate region of space is assigned to each of the two electrons, and it is as if each of them had its own individuality. Hence no exchange phenomenon occurs.

The case, excluded until now, in which both particles have the same quantum numbers ($n_1 = n_2$) does not give rise to degeneracy. Hence there exists a single eigenfunction in that case, and it is symmetric. Evidently we must exclude these states when dealing with particles obeying the Pauli principle.
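The two-by-two secular problem (376) is small enough to check numerically. The following Python sketch is only an illustration: the function name `exchange_splitting` and the sample values of $L_{11}$ and $L_{12}$ are assumptions introduced here, not quantities from the text. It verifies that the symmetric and antisymmetric combinations of $\psi_1$ and $\psi_2$ are eigenvectors of the perturbation matrix, with shifts $L_{11} \pm L_{12}$, under the conditions $L_{11} = L_{22}$ and $L_{12} = L_{21}$ (taken real).

```python
import math


def exchange_splitting(l11: float, l12: float):
    """Check the 2x2 secular problem with L11 = L22 and L12 = L21 (real).

    Returns the two first-order shifts and the largest residual of
    (L - eps) c for the symmetric and antisymmetric coefficient vectors.
    """
    matrix = [[l11, l12], [l12, l11]]
    norm = 1.0 / math.sqrt(2.0)
    # Coefficients (c1, c2) of psi_1 and psi_2 in the two combinations.
    vectors = {"s": (norm, norm), "a": (norm, -norm)}
    shifts = {"s": l11 + l12, "a": l11 - l12}

    worst = 0.0
    for label, c in vectors.items():
        for row in range(2):
            applied = matrix[row][0] * c[0] + matrix[row][1] * c[1]
            worst = max(worst, abs(applied - shifts[label] * c[row]))
    return shifts, worst


if __name__ == "__main__":
    # Hypothetical matrix elements in arbitrary energy units.
    shifts, worst = exchange_splitting(0.30, 0.05)
    print(shifts, worst)
```

The separation between the two shifted levels is $2L_{12}$, so the splitting disappears when the exchange integral vanishes.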

We shall now briefly outline an argument which, though of purely formal value, explains the name of exchange or resonance phenomena, which is generally given to phenomena characteristic of systems with equal particles.⁹ Let us suppose that the two particles in question, though dynamically equal, may be distinguished in some manner by means of a “mark” which does not alter their mechanical properties (a supposition which is evidently fictitious and devoid of physical significance). Under this hypothesis it is possible to distinguish between the case in which particle 1 is in state $n_1$ and particle 2 in state $n_2$, and the case in which they are interchanged, that is, between the two states of the system represented by $\psi_1$ and $\psi_2$ in zero-order approximation. Besides $\psi_s$ and $\psi_a$, there will then also be admissible any of their linear combinations

$$\psi = c_s\,\psi_s + c_a\,\psi_a, \tag{382}$$

with $c_s$ and $c_a$ constant. Since to $\psi_s$ there corresponds the eigenvalue $E_s$, and to $\psi_a$ the eigenvalue $E_a$ in first approximation, these eigenfunctions may be written in the form

$$\psi_s = u_s\,e^{-\frac{2\pi i}{h}E_s t}, \qquad \psi_a = u_a\,e^{-\frac{2\pi i}{h}E_a t},$$

where $u_s$ and $u_a$ are functions of the coordinates but not of $t$, and $E_s$ and $E_a$ are given by (378) and (378'). Substituting in (382), and putting

$$C_s = c_s\,e^{-\frac{2\pi i}{h}(L_{11} + L_{12})t}, \qquad C_a = c_a\,e^{-\frac{2\pi i}{h}(L_{11} - L_{12})t}, \tag{383}$$

we have

$$\psi = (C_s u_s + C_a u_a)\,e^{-\frac{2\pi i}{h}E^0 t}. \tag{384}$$

It is apparent that $C_s$ and $C_a$ are not constants like $c_s$ and $c_a$, but vary periodically and slowly. In fact, their frequency $(L_{11} \pm L_{12})/h$ is small compared with the frequency $E^0/h$ of the principal factor of $\psi$, since the perturbation is small compared with the energy.
9 See W. Heisenberg, Zeits. f. Physik 38, 411 (1926).

If the solution (384) is considered for a time that is short compared with the period of variation of $C_s$ and $C_a$, so as to enable us to regard them as constants, this solution can be approximately identified with the linear combination

$$\psi = C_s\,\psi_s^0 + C_a\,\psi_a^0 \tag{385}$$

of the solutions of the zero-order approximation (380). In other words, our $\psi$ may be considered to be a linear combination, with slowly varying coefficients, of $\psi_s^0$ and $\psi_a^0$, and hence also of $\psi_1$ and $\psi_2$. In fact we have, after substituting (380) into (384),

$$\psi = \frac{1}{\sqrt{2}}(C_s + C_a)\,\psi_1 + \frac{1}{\sqrt{2}}(C_s - C_a)\,\psi_2.$$

Let us now suppose that we have found out, at time $t = 0$, that particle 1 is in state $n_1$ and particle 2 in state $n_2$, that is, that $\psi$ is represented (approximately) by $\psi_1$. Then we get from the preceding equation and from (383), setting $t = 0$ and $\psi = \psi_1$, that $c_s = c_a = 1/\sqrt{2}$, and hence for any time $t$ we may write

$$\psi = e^{-\frac{2\pi i}{h}L_{11}t}\left[\psi_1 \cos\frac{2\pi}{h}L_{12}t - i\,\psi_2 \sin\frac{2\pi}{h}L_{12}t\right]. \tag{386}$$

Now the square of the modulus of the coefficient of $\psi_1$, or $\cos^2 (2\pi/h)L_{12}t$, represents the probability of finding the system in the state $\psi_1$ at time $t$, that is, particle 1 in state $n_1$ and particle 2 in state $n_2$, while $\sin^2 (2\pi/h)L_{12}t$ analogously represents the probability of finding the two particles interchanged. We see that at time $\tau/2$, where $\tau = h/(2L_{12})$, the two particles are found to be interchanged with certainty; at time $\tau$ they are again as they were at time $t = 0$; and so forth. At intermediate instants, the state of the system is such that we do not know which particle is in state $n_1$ and which in state $n_2$. The smaller the coupling between the two particles, as measured by the integral $L_{12}$ (the “exchange integral”), the larger is the period $\tau$ with which the two probabilities oscillate.
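The oscillation of the two probabilities can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names and the sample energies are hypothetical, Planck's constant is the CGS value quoted in the book's table of constants, and the negative sign in the time exponent is a convention chosen here (the probabilities do not depend on it). The code builds the slowly varying coefficients $C_s$, $C_a$ with $c_s = c_a = 1/\sqrt{2}$ and forms the amplitudes of $\psi_1$ and $\psi_2$.

```python
import cmath
import math

H_PLANCK = 6.6237e-27  # erg s, value quoted in the book's table of constants


def exchange_amplitudes(l11, l12, t, h=H_PLANCK):
    """Amplitudes of psi_1 and psi_2 for a system that is in psi_1 at t = 0."""
    omega = 2.0 * math.pi / h
    # Slowly varying coefficients with c_s = c_a = 1/sqrt(2).
    # The sign of the exponent is an assumed convention.
    big_cs = cmath.exp(-1j * omega * (l11 + l12) * t) / math.sqrt(2.0)
    big_ca = cmath.exp(-1j * omega * (l11 - l12) * t) / math.sqrt(2.0)
    amp_1 = (big_cs + big_ca) / math.sqrt(2.0)
    amp_2 = (big_cs - big_ca) / math.sqrt(2.0)
    return amp_1, amp_2


def exchange_period(l12, h=H_PLANCK):
    """Period tau = h / (2 L12) of the two exchange probabilities."""
    return h / (2.0 * l12)


if __name__ == "__main__":
    # Hypothetical matrix elements, in erg.
    l11, l12 = 3.0e-14, 1.0e-14
    tau = exchange_period(l12)
    for fraction in (0.0, 0.25, 0.5, 1.0):
        a1, a2 = exchange_amplitudes(l11, l12, fraction * tau)
        print(fraction, abs(a1) ** 2, abs(a2) ** 2)
```

At $t = \tau/2$ the probability of finding the particles interchanged is one, and at $t = \tau$ the initial assignment is restored; halving $L_{12}$ doubles $\tau$.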

This curious exchange phenomenon is analogous, as far as its analytic aspect is concerned, to the periodic energy exchange which takes place between two oscillators of equal frequency which are loosely “coupled”; for example, between two pendula of the same length hung from a common horizontal wire which is not too taut. This mechanical analogy has suggested the name “resonance.”

66. Separation of the spin coordinates from the position coordinates. In many cases in a system with two electrons, the forces due to the spins are negligible in first approximation. Formally, this statement means that the spin variables appear neither in $\mathcal{H}_1$, $\mathcal{H}_2$ nor in the main part of the interaction term $\mathcal{S}$, but only in a term of higher order, which is being neglected in a first approximation. When these conditions are realized, it is said that there is Russell-Saunders coupling between the two electrons.

The first of these hypotheses enables us to treat (as was seen in §45 in the Pauli approximation) the unperturbed problem by separating the spin variable $\sigma$ from the position variables, that is, by writing a general solution of (369) in the form

$$\psi_{n_i s_i}(x_i, \sigma_i) = \Psi_{n_i}(x_i)\,\varphi_{s_i}(\sigma_i), \tag{387}$$

where $i = 1, 2$, and $n_i$ now represents only the group of three orbital quantum numbers of the $i$th electron, and the spin quantum number $s_i$ is considered separately. Similarly, $x_i$ stands for the three position coordinates $x_i, y_i, z_i$. The factor $\Psi_{n_i}$ satisfies the Schrödinger equation

$$\mathcal{H}\,\Psi_{n_i} = E_{n_i}\,\Psi_{n_i}, \tag{388}$$

and the spin eigenfunction $\varphi_s$ reduces essentially to a group of two constants $\alpha_s$ and $\beta_s$ (corresponding to the two values $\pm 1$ of the variable $\sigma$, respectively), so that we may write, designating by $s$ an index which can take on the two values $\pm 1$,

$$
\varphi_s(\sigma) =
\begin{cases}
\alpha_s & \text{for } \sigma = +1, \\
\beta_s & \text{for } \sigma = -1.
\end{cases}
$$

Note that since $s$ can take only two values, there exist only two “functions” $\varphi_s$, or pairs $(\alpha_s, \beta_s)$. Let us assume these “functions” to be orthonormal; that is,

$$\sum_{\sigma} \varphi_s^{*}(\sigma)\,\varphi_{s'}(\sigma) = \delta_{ss'}. \tag{389}$$

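After these preliminaries, the two eigenfunctions of zero-order approximation, namely, the symmetric and antisymmetric eigenfunctions (381) and (381'), may now be written as follows, making use of (387):

$$\psi_s^0 = \frac{1}{\sqrt{2}}\left[\Psi_{n_1}(x_1)\Psi_{n_2}(x_2)\,\varphi_{s_1}(\sigma_1)\varphi_{s_2}(\sigma_2) + \Psi_{n_1}(x_2)\Psi_{n_2}(x_1)\,\varphi_{s_1}(\sigma_2)\varphi_{s_2}(\sigma_1)\right],$$

$$\psi_a^0 = \frac{1}{\sqrt{2}}\left[\Psi_{n_1}(x_1)\Psi_{n_2}(x_2)\,\varphi_{s_1}(\sigma_1)\varphi_{s_2}(\sigma_2) - \Psi_{n_1}(x_2)\Psi_{n_2}(x_1)\,\varphi_{s_1}(\sigma_2)\varphi_{s_2}(\sigma_1)\right].$$

With the eigenfunctions of position $\Psi$ we now form the following combinations, of which the first is symmetric and the second antisymmetric:

$$
\begin{aligned}
\mathfrak{S}(x_1, x_2) &= \tfrac{1}{2}\left[\Psi_{n_1}(x_1)\Psi_{n_2}(x_2) + \Psi_{n_1}(x_2)\Psi_{n_2}(x_1)\right], \\
\mathfrak{A}(x_1, x_2) &= \tfrac{1}{2}\left[\Psi_{n_1}(x_1)\Psi_{n_2}(x_2) - \Psi_{n_1}(x_2)\Psi_{n_2}(x_1)\right],
\end{aligned}
\tag{390}
$$

and similarly with the spin eigenfunctions:

$$
\begin{aligned}
s(\sigma_1, \sigma_2) &= \tfrac{1}{2}\left[\varphi_{s_1}(\sigma_1)\varphi_{s_2}(\sigma_2) + \varphi_{s_1}(\sigma_2)\varphi_{s_2}(\sigma_1)\right], \\
a(\sigma_1, \sigma_2) &= \tfrac{1}{2}\left[\varphi_{s_1}(\sigma_1)\varphi_{s_2}(\sigma_2) - \varphi_{s_1}(\sigma_2)\varphi_{s_2}(\sigma_1)\right].
\end{aligned}
\tag{391}
$$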


Then $\psi_s^0$ and $\psi_a^0$ may be expressed in terms of these new combinations, and they become

$$\psi_s^0 = \sqrt{2}\,(\mathfrak{S}\,s + \mathfrak{A}\,a), \tag{392}$$

$$\psi_a^0 = \sqrt{2}\,(\mathfrak{S}\,a + \mathfrak{A}\,s). \tag{392'}$$

Note now that since we neglect the forces due to the spins, the eigenvalues will be independent of the spin quantum numbers $s_1$, $s_2$, depending only on the remaining quantum numbers $n_1$, $n_2$. From this condition is derived a further degeneracy, which we shall now investigate.

If $n_1$ and $n_2$ are fixed, four possibilities exist for $s_1$ and $s_2$:

$$
\begin{array}{lll}
(1) & s_1 = +1, & s_2 = +1 \\
(2) & s_1 = -1, & s_2 = -1 \\
(3) & s_1 = +1, & s_2 = -1 \\
(4) & s_1 = -1, & s_2 = +1,
\end{array}
\tag{393}
$$

to which there correspond an equal number of symmetric eigenfunctions of the type (392), which we shall denote by $\psi_s^1$, $\psi_s^2$, $\psi_s^3$, $\psi_s^4$, respectively, and an equal number of antisymmetric eigenfunctions of the type (392'), which we shall call $\psi_a^1$, $\psi_a^2$, $\psi_a^3$, $\psi_a^4$. They all correspond to the eigenvalue $E_{n_1} + E_{n_2}$ in zero-order approximation.

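However, when the perturbation $\mathcal{S}$ is introduced, these eigenfunctions must be replaced (see §39) by an equal number of appropriately selected linear combinations, which we designate by $\bar\psi_s^i$, $\bar\psi_a^i$ $(i = 1, 2, 3, 4)$. The $\bar\psi_s^i$ will be formed by the $\psi_s$ alone and the $\bar\psi_a^i$ by the $\psi_a$ alone, since, as we have said in §62, a linear combination of a $\psi_s$ with a $\psi_a$ is physically not permissible. However, since we are dealing with electrons, which obey the Pauli principle, we must exclude the symmetric eigenfunctions $\psi_s$, and hence we shall deal only with the $\psi_a$ from now on. They are given by the formulas

$$\bar\psi_a^i = \sum_{j=1}^{4} c^{ij}\,\psi_a^j,$$

where the coefficients $c^{ij}$ are obtained (see §39) by means of the four systems of linear equations

$$
\begin{aligned}
(L^{11} - \varepsilon^i)\,c^{i1} + L^{12}c^{i2} + L^{13}c^{i3} + L^{14}c^{i4} &= 0 \\
L^{21}c^{i1} + (L^{22} - \varepsilon^i)\,c^{i2} + L^{23}c^{i3} + L^{24}c^{i4} &= 0 \\
L^{31}c^{i1} + L^{32}c^{i2} + (L^{33} - \varepsilon^i)\,c^{i3} + L^{34}c^{i4} &= 0 \\
L^{41}c^{i1} + L^{42}c^{i2} + L^{43}c^{i3} + (L^{44} - \varepsilon^i)\,c^{i4} &= 0
\end{aligned}
\tag{394}
$$

and the elements $L^{jl}$ of the perturbation matrix are given by

$$L^{jl} = \sum_{\sigma_1 \sigma_2} \int \psi_a^{j*}\,\mathcal{S}\,\psi_a^{l}\,dS. \tag{395}$$

$\varepsilon^i$ is one of the roots obtained by equating the determinant of the coefficients to zero, and the eigenfunction $\bar\psi_a^i$ belongs to the eigenvalue

$$E_{n_1} + E_{n_2} + \varepsilon^i.$$

In order to calculate $L^{jl}$, let us use (391) to find the spin eigenfunctions which correspond to the four pairs of values (393) for $s_1$ and $s_2$. We obtain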

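$$
\begin{aligned}
s_1 &= \varphi_1(\sigma_1)\varphi_1(\sigma_2), \qquad s_2 = \varphi_{-1}(\sigma_1)\varphi_{-1}(\sigma_2), \\
s_3 &= s_4 = \tfrac{1}{2}\left[\varphi_1(\sigma_1)\varphi_{-1}(\sigma_2) + \varphi_1(\sigma_2)\varphi_{-1}(\sigma_1)\right], \\
a_1 &= 0, \qquad a_2 = 0, \\
a_3 &= -a_4 = \tfrac{1}{2}\left[\varphi_1(\sigma_1)\varphi_{-1}(\sigma_2) - \varphi_1(\sigma_2)\varphi_{-1}(\sigma_1)\right].
\end{aligned}
\tag{396}
$$

If we take (392') into account and recall that

$$\int \mathfrak{S}^{*}\,\mathcal{S}\,\mathfrak{A}\,dS = \int \mathfrak{A}^{*}\,\mathcal{S}\,\mathfrak{S}\,dS = 0 \tag{397}$$

(because of the antisymmetric nature of the integrand), we can write (395) as

$$L^{jl} = 2I_S \sum_{\sigma_1 \sigma_2} a_j^{*}a_l + 2I_A \sum_{\sigma_1 \sigma_2} s_j^{*}s_l, \tag{398}$$

where we have put

$$I_S = \int \mathfrak{S}^{*}\,\mathcal{S}\,\mathfrak{S}\,dS, \qquad I_A = \int \mathfrak{A}^{*}\,\mathcal{S}\,\mathfrak{A}\,dS. \tag{399}$$

The two double summations for the various pairs $(j, l)$ are calculated by use of (391) and (389), and we find the following result for the matrix of the $L^{jl}$:

$$
\begin{pmatrix}
2I_A & 0 & 0 & 0 \\
0 & 2I_A & 0 & 0 \\
0 & 0 & I_A + I_S & I_A - I_S \\
0 & 0 & I_A - I_S & I_A + I_S
\end{pmatrix}
\tag{399'}
$$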


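The secular equation which yields the $\varepsilon^i$ may then be written as

$$
(2I_A - \varepsilon)^2
\begin{vmatrix}
I_A + I_S - \varepsilon & I_A - I_S \\
I_A - I_S & I_A + I_S - \varepsilon
\end{vmatrix} = 0,
$$

and its roots are

$$\varepsilon^1 = \varepsilon^2 = \varepsilon^3 = 2I_A, \qquad \varepsilon^4 = 2I_S. \tag{400}$$

Substituting (399') and (400) into the systems (394) we find that for $i = 1, 2, 3$, the first two equations are identically satisfied, and the other two are equivalent (if we suppose, and we shall, that $I_S \neq I_A$¹⁰) to $c^{i3} = c^{i4}$, while $c^{i1}$, $c^{i2}$ remain arbitrary. However, for $i = 4$ we find from the first two equations that $c^{41} = c^{42} = 0$, and from the other two that $c^{43} = -c^{44}$. Finally we have

$$\text{for } i = 1, 2, 3: \quad \bar\psi_a^i = c^{i1}\psi_a^1 + c^{i2}\psi_a^2 + c^{i3}(\psi_a^3 + \psi_a^4);$$

$$\text{for } i = 4: \quad \bar\psi_a^4 = c^{43}(\psi_a^3 - \psi_a^4).$$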


The coefficients $c$ contained in these formulas remain arbitrary, except for the normalization conditions. If we take (392') and (396) into account and put

$$\bar s_i = c^{i1}s_1 + c^{i2}s_2 + 2c^{i3}s_3, \qquad \bar a = 2c^{43}a_3,$$

the preceding expressions are written:

$$\bar\psi_a^i = \sqrt{2}\,\mathfrak{A}\,\bar s_i \quad (i = 1, 2, 3), \qquad \bar\psi_a^4 = \sqrt{2}\,\mathfrak{S}\,\bar a. \tag{401}$$

(Note that in the case $n_1 = n_2$, we have $\mathfrak{A} = 0$, and hence the states corresponding to $i = 1, 2, 3$ are absent.)

10 The case $I_S = I_A$ corresponds to the case in which the exchange integral vanishes. (See footnote on page 461.)

The first three eigenfunctions correspond (in our approximation) to the eigenvalue

$$E_{n_1} + E_{n_2} + 2I_A, \tag{402}$$

and the fourth corresponds to the eigenvalue

$$E_{n_1} + E_{n_2} + 2I_S. \tag{402'}$$

In general, therefore, the original level $E_{n_1} + E_{n_2}$ splits into a triplet level¹¹ (402) and a singlet level (402'). For $n_1 = n_2$ the triplet level is absent. As may be seen from (401), the eigenfunctions for the triplet are obtained by multiplying an antisymmetric Schrödinger eigenfunction by a symmetric spin eigenfunction, whereas the singlet $\bar\psi_a^4$ is the product of a symmetric Schrödinger eigenfunction and an antisymmetric spin function. The separation between the singlet and triplet levels is given by $2(I_S - I_A)$, or, as is easily found with the aid of (399) and (390), by

$$2\int \Psi_{n_1}^{*}(x_1)\,\Psi_{n_2}^{*}(x_2)\,\mathcal{S}\,\Psi_{n_1}(x_2)\,\Psi_{n_2}(x_1)\,dS. \tag{402''}$$

The integral appearing here is exactly the exchange integral defined in the preceding section, computed, however, by accounting only for the eigenfunctions in $x, y, z$ without the spin factors.
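The splitting into a triplet and a singlet can be checked directly on the four-by-four matrix (399'). The Python sketch below is illustrative only: the function names and the sample values of $I_A$ and $I_S$ are assumptions introduced here. It confirms that the coefficient vectors with $c^{i3} = c^{i4}$ (and the two unit vectors along the first two axes) belong to the shift $2I_A$, while the vector with $c^{43} = -c^{44}$ belongs to the shift $2I_S$.

```python
def perturbation_matrix(i_a, i_s):
    """Matrix (399') of the perturbation in the basis psi_a^1 ... psi_a^4."""
    return [
        [2 * i_a, 0.0, 0.0, 0.0],
        [0.0, 2 * i_a, 0.0, 0.0],
        [0.0, 0.0, i_a + i_s, i_a - i_s],
        [0.0, 0.0, i_a - i_s, i_a + i_s],
    ]


def residual(matrix, vector, eigenvalue):
    """Largest component of (L - eps) c; zero for an exact eigenvector."""
    size = len(vector)
    return max(
        abs(sum(matrix[r][c] * vector[c] for c in range(size)) - eigenvalue * vector[r])
        for r in range(size)
    )


def triplet_singlet_levels(i_a, i_s):
    matrix = perturbation_matrix(i_a, i_s)
    # Three independent coefficient vectors for the threefold root 2 I_A.
    triplet_vectors = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]
    # Coefficient vector for the simple root 2 I_S.
    singlet_vector = [0, 0, 1, -1]
    worst = max(residual(matrix, v, 2 * i_a) for v in triplet_vectors)
    worst = max(worst, residual(matrix, singlet_vector, 2 * i_s))
    return {
        "triplet_shift": 2 * i_a,
        "singlet_shift": 2 * i_s,
        "separation": 2 * (i_s - i_a),
        "max_residual": worst,
    }


if __name__ == "__main__":
    # Hypothetical integrals in arbitrary energy units.
    print(triplet_singlet_levels(0.10, 0.25))
```

With $I_S = I_A$ the two shifts coincide, which is the case of a vanishing exchange integral excluded in the text.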

This result finds a meaningful expression in the vector model, in which two spins oriented in the same direction, or “parallel” spins, are made to correspond to a symmetric spin eigenfunction, whereas an antisymmetric spin eigenfunction corresponds to spins in opposite directions, or “antiparallel” spins.

11 If we take the forces due to the spins into account, this triplet level will in turn split into three singlet levels, except for the case where the position eigenfunction $\mathfrak{A}$ represents a state with zero azimuthal quantum number (“s-state”), in which the perturbation energy due to spin is zero, and the three levels of the triplet stay together.

Hence, when Russell-Saunders coupling holds, it is legitimate to use the ordinary Schrödinger theory rather than the Dirac theory, with the following limitation (which expresses the Pauli principle) added: the eigenfunctions must be antisymmetric in the position coordinates if the spins are parallel and must be symmetric if the spins are antiparallel. In the first case we have a triplet level (which the vector model interprets in terms of the three possible orientations which the total spin may have with respect to the orbital magnetic field), whereas in the second case (the resultant spin being zero) we can have only a singlet level. If then the two electrons are in “the same orbit,” that is, if their orbital quantum numbers are equal (symbolically: $n_1 = n_2$), their spins cannot align themselves parallel, because in that case the two electrons would have all their four quantum numbers equal, a condition prohibited by the Pauli principle. Hence for $n_1 = n_2$ the triplet is absent, and only the singlet exists.

A characteristic trait of this quantum theory of “exchange” is that the energy difference between the singlet and corresponding triplet level exists, as we have seen, even when the magnetic actions upon the spin are neglected, whereas according to the vector model the two states would differ only in their opposite spin orientations. Hence their energy difference would be due only to the interaction between their magnetic moments. The reason is that in the preceding theory the spins, although not occurring in the expression of the Hamiltonian, enter indirectly through the application of the Pauli principle, since they compel us to select the symmetric or antisymmetric (position) eigenfunction according to whether the spins are antiparallel or parallel. It can be shown¹³ that this selection is formally equivalent to assuming a very strong interaction energy between the spins, proportional to the cosine of the angle between them. Thus we see justification of the success which this model-based hypothesis had in the interpretation of multiplet spectra before the advent of quantum mechanics.

12 Consider the operator representing the square of the total spin, which is

$$\sigma^2 = \left(\sigma_x^{(1)} + \sigma_x^{(2)}\right)^2 + \left(\sigma_y^{(1)} + \sigma_y^{(2)}\right)^2 + \left(\sigma_z^{(1)} + \sigma_z^{(2)}\right)^2,$$

where $\sigma_x^{(1)}$, $\sigma_y^{(1)}$, $\sigma_z^{(1)}$ are the operators corresponding to the spin components of the first electron (formed according to §45) and $\sigma_x^{(2)}$, $\sigma_y^{(2)}$, $\sigma_z^{(2)}$ are those for the second electron. Applying $\sigma^2$ to a function (of the spin variables) which is antisymmetric ($a$) or symmetric ($s$), we find, respectively,

$$\sigma^2 a = 0, \qquad \sigma^2 s = 8s.$$

This means, according to the fundamental principle of quantum mechanics, that a measurement of the total spin $\sigma$ would give zero in the first case, $\sqrt{8}$ in the second case. (The vector model would give: $\sigma = 0$ for antiparallel spins, $\sigma = 2$ for parallel spins. The numerical deviation of the last value from $\sqrt{8}$ is due to the already noted imperfection of the vector model; see, for instance, §57 of Part II.)
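The statement of the footnote on the total spin, that $\sigma^2$ gives 0 on an antisymmetric spin function and 8 on a symmetric one, can be verified with explicit matrices. The sketch below assumes the usual Pauli matrices for the spin components and the basis ordering $(\sigma_1, \sigma_2) = (+,+), (+,-), (-,+), (-,-)$; these choices, and all function names, are assumptions of this illustration rather than notation taken from §45.

```python
IDENTITY = [[1, 0], [0, 1]]
# Standard Pauli matrices (assumed representation of the spin components).
PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}


def kron(a, b):
    """Kronecker product of two 2x2 matrices (first factor: first electron)."""
    return [
        [a[i][j] * b[k][l] for j in range(2) for l in range(2)]
        for i in range(2)
        for k in range(2)
    ]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matadd(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def total_spin_squared():
    """Sum over x, y, z of (sigma^(1) + sigma^(2))^2 as a 4x4 matrix."""
    total = [[0] * 4 for _ in range(4)]
    for sigma in PAULI.values():
        component = matadd(kron(sigma, IDENTITY), kron(IDENTITY, sigma))
        total = matadd(total, matmul(component, component))
    return total


def apply(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


if __name__ == "__main__":
    operator = total_spin_squared()
    print(apply(operator, [0, 1, 1, 0]))   # symmetric spin function
    print(apply(operator, [0, 1, -1, 0]))  # antisymmetric spin function
```

In these units (spin components equal to $\pm 1$) the symmetric functions are multiplied by 8 and the antisymmetric one is annihilated, in agreement with the values $\sqrt{8}$ and 0 quoted for the total spin.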

If we were to develop the calculations of this section by taking the symmetric functions $\psi_s$ rather than the $\psi_a$, we should find, by an identical procedure, that with fixed $n_1$ and $n_2$ there is a triplet of eigenfunctions of the form $\bar\psi_s^i = \sqrt{2}\,\mathfrak{S}\,\bar s_i$ $(i = 1, 2, 3)$, and a singlet eigenfunction of the form $\bar\psi_s^4 = \sqrt{2}\,\mathfrak{A}\,\bar a$. That is to say, under this hypothesis the triplet levels would correspond to parallel spins; the singlet levels to antiparallel spins. Since $\mathfrak{A} = 0$ for $n_1 = n_2$, the singlet state rather than the triplet state would be absent. As will be pointed out in the following section, the experimental results on helium and similar spectra contradict this hypothesis, thus fully confirming the antisymmetry principle.

The preceding considerations have led us to divide the antisymmetric states (with respect to an exchange of all the coordinates) into two classes, according to whether they are antisymmetric or symmetric with respect to an exchange of the position coordinates only. The spectral terms of the first class constitute the “triplet system”; those of the second class, the “singlet system.” We shall now prove a theorem analogous to that shown in §62, namely, that the transitions between the two classes of states are forbidden, or that the two systems of spectral terms do not combine with each other. However, though the theorem of §62 is rigorously true, the theorem we are about to prove holds only in the approximation of this section, where the spin forces are neglected. Hence these transitions must not be considered to be absolutely impossible but only highly improbable. In fact, as we shall see below in connection with the helium spectrum, some of the spectral lines corresponding to these transitions have been observed, but with very weak intensity. The proof is entirely analogous to that of §62. Since the Hamiltonian operator $\mathcal{H}$ does not involve the spin variables in the approximation considered here, it is necessarily symmetric with respect to the position coordinates. Hence from (364) it follows that if $\psi$ is symmetric (or antisymmetric) with respect to the position coordinates, $d\psi$ will still have the same property, and hence also $\psi(t + dt)$.

13 No. 6 of the Bibliography, page 226.



67. The spectrum of the helium atom and analogous spectra. 

The above results find one of their most important applications in the case of the helium atom. Long before the rise of quantum mechanics, it was observed that the spectral lines of (non-ionized) helium could be divided into two classes, for each of which a system of separate terms was found (see Fig. 48). This discovery led to the belief that there were two distinct species of atoms, to which the names parhelium and orthohelium were given.¹⁴ The lines of parhelium are all singlets, whereas those of orthohelium when examined with instruments of high resolving power generally reveal a fine structure, characteristic of a system of triplets.¹⁵ The ground level (which lies considerably lower than the others) belongs to the singlet system and corresponds to a 1s-state. The corresponding term in the triplet system is missing. Later an intercombination line was found, corresponding to a transition from one of the triplet levels to one of the singlet levels (see the broken line in Fig. 48), but its rather weak intensity indicates that such transitions occur with very small probability.

These experimental results are in excellent agreement with the theory presented in the preceding sections: the atoms of parhelium are those with antiparallel spins (that is, those whose $\psi$ is symmetric in the position coordinates), whereas the atoms of orthohelium are those with parallel spins (or with a $\psi$ antisymmetric in the position coordinates). In the ground state, both electrons are naturally in the lowest state, which is the 1s-state. They must therefore possess antiparallel spins in order to obey the Pauli principle, and hence this state belongs to the singlet series and has no analogue among the triplets. (This conclusion represents the confirmation of the postulate mentioned in the last section, that the $\psi$ which actually occur are the antisymmetric ones, rather than the symmetric ones.) In the excited states, one of the electrons continues to occupy the 1s orbit, while the other electron moves to higher orbits. Hence their spins may be oriented both parallel and antiparallel.

14 The reason for these names lies in the former belief that the orbits of the two electrons were coplanar in parhelium and perpendicular in orthohelium. The names have persisted, although the model (which has never led to quantitatively satisfactory results) has been abandoned.

15 This does not mean that the terms of orthohelium are all triplets but rather that they are triplets with the exception of the S terms, which are always singlets (in this connection see a volume on spectroscopy, or No. 27 of the Bibliography). We might add that the triplets of orthohelium are very narrow, so that many of them cannot be resolved or can be resolved only partially, so that they look like doublets. Their triplet nature is confirmed, however, by the analogous spectra (Li$^{+}$, C$^{++++}$), in which the separations are generally much larger.

We shall now outline the quantitative comparison between theory and experiment. This constitutes one of the most remarkable successes of quantum mechanics, especially because all previous attempts at theoretical interpretation on the basis of the Sommerfeld theory and others failed completely. Quantum mechanics contains the necessary elements for calculating not only the numerical values of the separate spectral terms but also all other physical constants of the helium atom, such as diamagnetic susceptibility, dielectric constant, van der Waals' constant, and boiling point. However, these calculations are quite laborious, since methods of successive approximations must necessarily be applied, and have given rise to numerous works of a mathematical nature.¹⁶ We shall confine ourselves to enumerating some examples of the results. The quantity which has been calculated with the greatest care is the numerical value of the ground term (in other words, the ionization potential). These calculations have given

198,308 cm$^{-1}$

as compared with the experimental result:

198,298 ± 8 cm$^{-1}$;

the relative difference is 1 part in 20,000. The ionization potential turns out to be about 24.47 volts. For the excited states, the numerical calculation is more difficult and has not been pushed to such a high degree of accuracy, but it is always in satisfactory agreement with experiment. Concerning the effects of an electric or magnetic field, we restrict ourselves to citing the following results:

                               Calculated                  Observed

Magnetic susceptibility        $-1.87 \times 10^{-6}$      $-1.88 \times 10^{-6}$

Dielectric constant            1.000071                    1.000074

As is to be expected, spectra analogous to the helium atom are emitted by the ions Li$^{+}$, Be$^{++}$, B$^{+++}$, C$^{++++}$, which, like the helium atom, have two electrons and differ from it only in the nucleus. Hence the formulas of He also apply to them after the numerical values have been changed. In these cases too, the agreement with experiment is excellent, as is shown by the following tabulation of the values of the ground terms in cm$^{-1}$:

                 Calculated        Observed

Li$^{+}$         609,985           610,090 ± 100

Be$^{++}$        1,241,222         1,241,350 ± 200

B$^{+++}$        2,091,770         2,092,000 ± 300

C$^{++++}$       3,161,770         3,161,900 ± 800

16 An over-all presentation of these works, with bibliographical references, can be found in No. 1 of the Bibliography, from which the numerical data quoted here are taken.

Finally, it should be added that the atoms of the second column 
of the periodic system, or alkaline earths, present a certain analogy 
to the helium atom as far as the criterion in §59 of Part II is con¬ 
cerned, when the two valence electrons are considered to be subject 
to the attraction of the nucleus diminished by the screening 
action of the remaining Z — 2 electrons (which form completed 
shells, and hence may to a good approximation be replaced by a 
static space charge; see footnote on page 253). Therefore the 
problem differs from the helium case, because the field in which the 
two electrons move is not Newtonian. In this case, of course, 
the theory may not be made to yield very exact quantitative pre¬ 
dictions, but its qualitative conclusions are confirmed by experi¬ 
ment. In fact, we still obtain a singlet system and a triplet system 
in these cases (the latter with considerably larger separations than 
in helium). The intercombinations between singlets and triplets 
are somewhat less improbable, however, as predicted by the theory 
and confirmed by observation. For this and other spectroscopic 
applications of quantum mechanics, see a text on spectroscopy. 
(See footnote, page 252.) 

68. Note concerning other applications. The theory of systems with equal particles presented in this chapter (in which the quantum mechanical concept of resonance or exchange explained in §65 is characteristic) has also been successfully applied to the study of the collisions of slow electrons with hydrogen or helium atoms.¹⁷ In the latter phenomenon the exchange between the colliding electron and those of the atom influences considerably the angular distribution of the electrons after the collision, and this influence is revealed in the experimental curves. Another important application is the explanation, due to Heisenberg,¹⁸ of the origin of the so-called “Weiss molecular field,” which is necessary to explain ferromagnetism but for which no satisfactory justification was found until Heisenberg's explanation. Most important among the applications of the exchange concept is the theory of the hydrogen molecule H$_2$, due to Heitler and London, who for the first time gave a satisfactory explanation of the force holding the two hydrogen atoms together. This force is essentially due to the afore-mentioned phenomenon of exchange or resonance between the two electrons and is therefore called exchange force.¹⁹ We confine ourselves here to pointing out that these “exchange forces” generally play an important part in the interactions between neighboring atoms constituting a molecule or a crystal lattice, and presumably also in the interactions between the particles constituting the atomic nucleus. Furthermore, the physical theory of valence, introduced by Heitler and London for the case of the H$_2$ molecule, has been extended, at least qualitatively, to more complex molecules with considerable success, and seems capable of further development.

17 See Mott and Massey, The Theory of Atomic Collisions, Oxford: Clarendon Press, 1949; or G. Wentzel, Wellenmechanik der Stoss- und Strahlungsvorgänge, in Geiger and Scheel, Handbuch der Physik, XXIV/1, 2nd edition, Berlin: J. Springer, 1933.

18 Zeits. f. Physik 49, 619 (1928).


Values of Some Fundamental Constants*

(CGS Units)

Velocity of light                           $c = 2.99776 \times 10^{10}$

Planck's constant                           $h = 6.6237 \times 10^{-27}$

Avogadro's number†                          $N = 6.0235 \times 10^{23}$

Faraday (in e.m.u.)†                        $F = 9649.6$

Electronic charge:

    In e.s.u.                               $e = 4.8024 \times 10^{-10}$

    In e.m.u.                               $e/c = 1.60199 \times 10^{-20}$

Specific electronic charge:

    In e.s.u.                               $e/m = 5.2741 \times 10^{17}$

    In e.m.u.                               $e/mc = 1.75936 \times 10^{7}$

Electronic (rest) mass                      $m = 9.1055 \times 10^{-28}$

Ergs corresponding to 1 electron-volt       $1.60199 \times 10^{-12}$

Ratio of proton mass to electron mass       $M_p/m = 1836.57$

Proton mass                                 $M_p = 1.67229 \times 10^{-24}$

Bohr magneton                               $\mu_0 = 0.92736 \times 10^{-20}$

Fine-structure constant‡                    $\alpha = 2\pi e^2/hc = 0.72969 \times 10^{-2}$

* All values are taken from J. W. M. DuMond and E. R. Cohen, Rev. Mod. Phys. 20, 82 (1948) and ibid., 21, 651 (1949).

† The constants depending on the gram-molecular weight are here given on the chemical scale (see §1, Part I, footnote 2).

‡ R. T. Birge, Phys. Rev. 79, 193 (1950).
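Several entries of this table are tied together by simple relations, which gives a quick way to check the powers of ten. The Python sketch below is an illustration under stated assumptions: the constant names are hypothetical, the values are entered in CGS units with the exponents as read here, and the relations used ($F = N e/c$, $M_p = (M_p/m)\,m$, $\mu_0 = eh/4\pi mc$, $\alpha = 2\pi e^2/hc$, and 1 volt = $10^8$ e.m.u.) are the standard ones rather than anything stated in the table itself.

```python
import math

# Table values in CGS units (exponents as read from the table; names are illustrative).
C_LIGHT = 2.99776e10
H_PLANCK = 6.6237e-27
AVOGADRO = 6.0235e23
FARADAY_EMU = 9649.6
E_ESU = 4.8024e-10
E_EMU = 1.60199e-20
E_OVER_M_ESU = 5.2741e17
E_OVER_M_EMU = 1.75936e7
M_ELECTRON = 9.1055e-28
EV_IN_ERG = 1.60199e-12
MASS_RATIO = 1836.57
M_PROTON = 1.67229e-24
BOHR_MAGNETON = 0.92736e-20
ALPHA = 0.72969e-2


def relative_difference(computed, tabulated):
    return abs(computed - tabulated) / abs(tabulated)


def consistency_checks():
    """Relative deviation between each derived value and its table entry."""
    return {
        "e/c": relative_difference(E_ESU / C_LIGHT, E_EMU),
        "F = N e/c": relative_difference(AVOGADRO * E_EMU, FARADAY_EMU),
        "e/m": relative_difference(E_ESU / M_ELECTRON, E_OVER_M_ESU),
        "e/mc": relative_difference(E_EMU / M_ELECTRON, E_OVER_M_EMU),
        "1 eV in erg": relative_difference(E_EMU * 1.0e8, EV_IN_ERG),
        "M_p": relative_difference(MASS_RATIO * M_ELECTRON, M_PROTON),
        "mu_0": relative_difference(
            H_PLANCK * E_EMU / (4.0 * math.pi * M_ELECTRON), BOHR_MAGNETON
        ),
        "alpha": relative_difference(
            2.0 * math.pi * E_ESU ** 2 / (H_PLANCK * C_LIGHT), ALPHA
        ),
    }


if __name__ == "__main__":
    for name, deviation in consistency_checks().items():
        print(f"{name:12s} {deviation:.2e}")
```

The fine-structure constant shows the largest deviation, which is unsurprising since its table entry is quoted from a different source than the other values.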


19 For a presentation of this theory we refer the reader to the volume Molecole e cristalli (“Molecules and Crystals”) of the Italian physics series.